2010-06-03 19:31:29

by Ingo Molnar

[permalink] [raw]
Subject: suspend blockers & Android integration


* [email protected] <[email protected]> wrote:

> [...] Not only has the source code been made available, but hundreds of
> engineering hours have been made trying to accomodate the demands of LKML
> --- and LKML has said no to suspend blockers/wakelocks.

I dont think you are being fair here, at all.

Firstly, the suspend-blockers feature is not being rejected (fixing and
extending suspend is a worthwile goal), it's just that various different
schemes have been proposed by the people who'll eventually have to maintain
that code down the line.

Those reasons seem justified and they are based in praxis that have solved
similar problems to what Android tries to solve.

Sadly the response from the Android team has been 100% uncompromising: either
suspend blockers or nothing.

The thing is, if the insertion of 'hundreds of man hours' into discussing a
feature was technical grounds for upstream inclusion then we'd today have a
Linux kernel with:

- STREAMS
- a kABI
- modularized ipv4
- perfmon
- two dozen CPU schedulers
- zero-copy stupidly pushed to all the file APIs

... and IMO we'd be off much worse technically.

Lets realize it, Linux is an engineering effort that has literally cost about
ten thousand man years. That's about a _85 million_ man hours. It takes effort
to keep that kind of work valuable!

Also, why did the Android team start its contributions with such a difficult
and controversial kernel feature?

There is absolutely _zero_ technical reason why the Android team should
present this as as an all-or-nothing effort. Why not merge hw drivers first
(with suspend blockers commented or stubbed out), to reduce the fork distance?

Really, i myself have controversial kernel features pending all the time. They
dont go upstream for a few kernel releases - over a year sometimes - and
sometimes they never go upstream.

But the fact that some feature of mine is pending doesnt give me the right to
go away sulking, it doesnt mean i will block the whole flow of patches in
retaliation (as you seem to suggest Google will now have the right to do) - i
simply try to work it out.

Lets be reasonable and work it all out, ok?

Thanks,

Ingo


2010-06-03 19:50:14

by Brian Swetland

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Thu, Jun 3, 2010 at 12:30 PM, Ingo Molnar <[email protected]> wrote:
>
> Sadly the response from the Android team has been 100% uncompromising: either
> suspend blockers or nothing.

Well, we're willing to accept something that gives us the same
functionality (thus rewriting the api several times to meet various
objections, current discussions around
constraint-based-implementations / pm-qos, etc). We believe we're
solving a real problem here and have not seen a counter-proposal that
accomplishes the same.

Suggestions such as "just yell at developers for writing bad apps" or
"it's the user's fault if they install a lousy app" or "make your app
marketplace more restrictive" are not helpful. The technical
discussions around alternatives are more so (though I do feel like
we're going in circles in places), which again is why we're still here
talking about this (that and Arve is about a billion times more
patient and persistent than I am).

We're not interested in massively rearchitecting our userspace to
accomplish this (and the "rewrite your userspace!" proposals I've seen
have had race conditions and/or significant more complexity than the
wakelock model).

...

> Also, why did the Android team start its contributions with such a difficult
> and controversial kernel feature?

We started here because it's possibly the only api level change we
have -- almost everything else is driver or subarch type work or
controversial but entirely self-contained (like the binder, which I
would be shocked to see ever hit mainline). Assertions have been made
that because the "android kernel" (not a term I like -- linux is
linux, we have some assorted patches on top) has this feature it
represents a difficulty for silicon vendors trying to support both
Android projects and OEMs and mainline:

See: http://www.kroah.com/log/linux/android-kernel-problems.html and
various other rants about the evil terrible android forks, etc.

So, we figure, let's sort out the hard problem first and then move on
with our lives.

> There is absolutely _zero_ technical reason why the Android team should
> present this as as an all-or-nothing effort. Why not merge hw drivers first
> (with suspend blockers commented or stubbed out), to reduce the fork distance?

If that's the case then there is no problem and people could stop
yelling at us and just submit their drivers. Awesome.

I can't speak for all the nameless silicon vendors Greg represents,
that we apparently are preventing from doing this (how? I don't
know!), etc, but for my team maintaining multiple versions of drivers
is a headache, we'd rather square away the wakelock debate first and
figure something out there, as it just seems like a more logical
approach. Maybe we're crazy.

> Really, i myself have controversial kernel features pending all the time. They
> dont go upstream for a few kernel releases - over a year sometimes - and
> sometimes they never go upstream.
>
> But the fact that some feature of mine is pending doesnt give me the right to
> go away sulking, it doesnt mean i will block the whole flow of patches in
> retaliation (as you seem to suggest Google will now have the right to do) - i
> simply try to work it out.

We're not blocking anything. Hell, if people want drivers we wrote
upstream and we're not fast enough for 'em, we publish everything via
android.git.kernel.org, pretty aggressively rebase to follow latest
mainline, and release everything under GPLv2, ready-to-go. We have to
ship though, and as long as the version we maintain has the features
we need to ship and the mainline version doesn't, we're going to ship
based on our version, but this really shouldn't be surprising to
anyone.

> Lets be reasonable and work it all out, ok?

We're trying.

I do feel like we're suffering from lack of a clear "how do we move
forward" path, and in particular from an environment where every time
we do a bunch of work to address one set of concerns and entirely new
set of people pop up with different concerns (sometimes contradicting
the last round of changes we were asked to make, etc, etc).

Brian

2010-06-03 23:23:47

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Ingo Molnar <[email protected]> wrote:

> * [email protected] <[email protected]> wrote:
>
> > [...] Not only has the source code been made available, but hundreds of
> > engineering hours have been made trying to accomodate the demands of LKML
> > --- and LKML has said no to suspend blockers/wakelocks.
>
> I dont think you are being fair here, at all.
>
> Firstly, the suspend-blockers feature is not being rejected (fixing and
> extending suspend is a worthwile goal), it's just that various different
> schemes have been proposed by the people who'll eventually have to maintain
> that code down the line.

Btw., i'd like to summarize the scheduler based suspend scheme proposed by
Thomas Gleixner, Peter Zijlstra and myself. I found no good summary of it in
the big thread, and there are also new elements of the proposal:

- Create a 'deep idle' mode that suspends. This, if all constraints
are met, is triggered by the scheduler automatically: just like the other
idle modes are triggered currently. This approach fixes the wakeup
races because an incoming wakeup event will set need_resched() and
abort the suspend.

( This mode can even use the existing suspend code to bring stuff down,
therefore it also solves the pending timer problem and works even on
PC style x86. )

- Introduce a 'minimum wakeup latency' task attribute (task->latency),
settable via a scheduler syscall. This is an ABI that influences the kernel
how idle the system can go. (i.e. the equivalent of suspend blockers, just
not binary and not system-wide.)

- Solve crappy app confinement via the scheduler:

A first proposal was to use the existing cgroup mechanism, but we found
a different and probably more elegant solution:

We can slightly extend the scheduler and introduce another per task 'minimum
latency other tasks are allowed to run' scheduling attribute
(task->exclude_latency) - set via a scheduler syscall as well. (only
settable by privileged tasks - such as the screensaver.)

This allows a task to 'exclude' other tasks that dont have low-latency
requirements. Crappy apps would have a large latency value, so they'd
be idled out when a privileged task sets the exclusion level low enough.

In the case of Android, this would for example be used by the screensaver
to introduce different levels of runnability/idling.

[ Note that this scheme would also be useful in a completely different
scenario, for real-time tasks as well: it would allow extreme-RT tasks to
quiescence all lower prio tasks in a controlled manner. (even if the RT
task is sleeping) ]

- Controlled auto-suspend: drivers (such as input) could on wakeup
automatically set the 'minimum wakeup latency' value of wakee tasks to a
lower value. This automatically prevents another auto-suspend in the near
future: up to the point the wakee task increases its latency (via the
scheduler syscall) again and allows suspend again.

This means there will be no surprise suspends for a task that may take a
bit longer than usual to finish its work. [ Detail: this would only be done
for tasks that have a non-default (non-infinity) task->latency value - to
prevent the input driver from lowering latency values (and preventing
future suspends) just because some unaware apps are running and using input
drivers. ]

All in one, this scheme allows everything without exception that
suspend-blockers allows and supports all the important usecases:

- allows agressive auto-idling

- has no wakeup races

- allows crappy-app confinement and other finegrained suspend control

- it should be pretty easy to adopt by Android as well, as it goes
along similar principles of kernel automatisms combined with
user-space controlled task and system attributes.

It's straightforward to adapt and it is also more generic, more clean and more
flexible than suspend-blockers.

Please mention any remaining technical issues that may still be are
unaddressed.

Thanks,

Ingo

2010-06-03 23:43:18

by Linus Torvalds

[permalink] [raw]
Subject: Re: suspend blockers & Android integration



On Fri, 4 Jun 2010, Ingo Molnar wrote:
>
> This allows a task to 'exclude' other tasks that dont have low-latency
> requirements. Crappy apps would have a large latency value, so they'd
> be idled out when a privileged task sets the exclusion level low enough.

Quite frankly, this sounds fundamentally broken.

Think deadlock. The high-latency task got a lock, and now you're excluding
it because it scheduled away.

So from my perspective, putting that kind of logic deep in the system
sounds like the _last_ thing we want to do.

I think it's much saner to have a very targeted suspend blocker that only
blocks the opportunistic suspends and has _zero_ interaction with the rest
of the system (certainly none at all with core code like the scheduler).

And if somebody then suspends the traditional way (by an actual suspend
event, not that opportunistic thing), then the suspend blocker does
nothing at all - because it simply doesn't even _exist_ at that level.
It's only about the opportunistic suspends.

(I'd further suggest that disk wait and running in kernel mode disable any
opportunistic suspend anyway - but that's not about suspend blockers as
much as it is about just the opportunistic suspend itself).

Linus

2010-06-03 23:47:47

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Linus Torvalds <[email protected]> wrote:

> On Fri, 4 Jun 2010, Ingo Molnar wrote:
> >
> > This allows a task to 'exclude' other tasks that dont have low-latency
> > requirements. Crappy apps would have a large latency value, so they'd
> > be idled out when a privileged task sets the exclusion level low enough.
>
> Quite frankly, this sounds fundamentally broken.
>
> Think deadlock. The high-latency task got a lock, and now you're excluding
> it because it scheduled away.

Mail was a bit too long already so i trimmed it at the wrong place :-/

What you say is absolutely true, hence this would be driven via sched_tick() +
TIF notifiers - i.e. only ever treat user-mode tasks as 'idle-able'. This can
be done with no overhead to the regular fastpaths.

The TIF notifier would be the one scheduling to idle - and would thus do it
only to user-mode tasks.

Thanks,

Ingo

2010-06-04 00:40:08

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Ingo Molnar <[email protected]> wrote:

> - Create a 'deep idle' mode that suspends. This, if all constraints
> are met, is triggered by the scheduler automatically: just like the other
> idle modes are triggered currently. This approach fixes the wakeup
> races because an incoming wakeup event will set need_resched() and
> abort the suspend.
>
> ( This mode can even use the existing suspend code to bring stuff down,
> therefore it also solves the pending timer problem and works even on
> PC style x86. )

Note that this does not necessarily have to be implemented as 'execute suspend
from the idle task' code: scheduling from the idle task, while can certainly
be made to work, is a somewhat recursive concept that we might want to avoid
for robustness reasons.

Instead, the 'deepest idle' (suspend) method could consist of a wakeup of a
kernel thread (or of any of the existing kernel threads such as the migration
thread) - which kernel thread then does a race-free suspend: it offlines all
but one CPU [on platforms that need that] and then initiates the suspend - but
aborts the attempt if there's any sign of wakeup activity.

Thanks,

Ingo

2010-06-04 02:22:33

by Linus Torvalds

[permalink] [raw]
Subject: Re: suspend blockers & Android integration



On Fri, 4 Jun 2010, Ingo Molnar wrote:
>
> What you say is absolutely true, hence this would be driven via sched_tick() +
> TIF notifiers - i.e. only ever treat user-mode tasks as 'idle-able'. This can
> be done with no overhead to the regular fastpaths.
>
> The TIF notifier would be the one scheduling to idle - and would thus do it
> only to user-mode tasks.

The thing is, unless there is some _really_ deep other reason to do
something like this, I still think it's total overdesign to push any
knowledge/choices like this into the scheduler. I'd rather keep things way
more independent, less tied to each other and to deep kernel subsystems.

IOW, my personal opinion is that somethng like a suspend (blocker or not)
decision simply shouldn't be important enough to be tied into the
scheduler. Especially not if it could just be its own layer.

That said, as far as I know, the Android people have mostly been looking
at the suspend angle from a single-core standpoint. And I'm not at all
convinced that they should hijack the existing "/sys/power/state" thing
which is what I think they do now.

And those two things go together. The /sys/power/state thing is a global
suspend - which I don't think is appropriate for a opportunistic thing in
the first place, especially for multi-core.

A well-designed opportunistic suspend should be a two-phase thing: an
opportunistc CPU hotunplug (shutting down cores one by one as the system
is idle), and not a "global" event in the first place. And only when
you've reached single-core state should you then say "do I suspend the
system too".

So I've tried to look a bit at the patches, and my admittedly rough
comments so far is

- I really do prefer the "off to the side" approach that the current
google opportunistic suspend patches have. As mentioned, I don't think
this should be deep in the scheduler. Not at all.

- I do think there are possibly races and CPU idle issues there, but I
think they are mainly for the multi-core thing. And I think that's a
totally separate issue. Or it _should_ be.

- once you're single-core (whether because you never had more cores to
begin with, or because the "opportunistic CPU offlining" has taken down
the other cores), I think the suspend-blocker is fine as a concept, and
certainly shouldn't need any deep scheduler hooks.

so I'd like to see the opportunistc suspend thing think about CPU
offlining, and I'd like to see it disconnect from the existing
/sys/power/state. And I'd really not like to involved deep internal kernel
hooks into it.

But I'll also admit that maybe I'm not seeing some problems. I've frankly
tried to avoid the whole discussion until Andrew pulled me in yesterday.

Linus

2010-06-04 02:32:29

by Linus Torvalds

[permalink] [raw]
Subject: Re: suspend blockers & Android integration



On Thu, 3 Jun 2010, Linus Torvalds wrote:
>
> so I'd like to see the opportunistc suspend thing think about CPU
> offlining

Side note: one reason for me being somewhat interested in the CPU
offlining is that I think the Android kind of opportunistic suspend is
_not_ likely something I'd like to see on a desktop. But an the
"opportunistic CPU offliner"? That might _well_ be useful even outside of
any other suspend activity.

If the system is idle (or almost idle) for long times, I would heartily
recommend actively shutting down unused cores. Some CPU's are hopefully
smart enough to not even need that kind of software management, but I
suspect even the really smart ones might be able to take advantage of the
kernel saying: "I'm shutting you down, you don't have to worry about
latency AT ALL, because I'm keeping another CPU active to do any real
work".

I'd also be interested to see if it could even improve single-thread
performance if we end up doing the whole SMP->UP "lock" prefix rewriting
when the system is idle enough that we'd be better off running just a
single core. I dunno - just throwing that out there.

Anyway, the only reason I think this is related is literally because I
think that if we know there is only a single CPU active, I think the
actual "real" opportunistic suspend is easier. Suddenly you don't have to
worry about what happens on other run-queues etc, and whether another CPU
is just about to create a suspend block etc.

So I think they tie together, although it's mostly tangential. And as
mentioned, I think a opportunistic CPU suspend part is more relevant
outside of Android, and thus perhaps more widely interesting.

Linus

2010-06-04 03:43:07

by Arjan van de Ven

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Thu, 3 Jun 2010 19:26:50 -0700 (PDT)
Linus Torvalds <[email protected]> wrote:

>
> If the system is idle (or almost idle) for long times, I would
> heartily recommend actively shutting down unused cores. Some CPU's
> are hopefully smart enough to not even need that kind of software
> management, but I suspect even the really smart ones might be able to
> take advantage of the kernel saying: "I'm shutting you down, you
> don't have to worry about latency AT ALL, because I'm keeping another
> CPU active to do any real work".

sadly the reality is that "offline" is actually the same as "deepest C
state". At best.

As far as I can see, this is at least true for all Intel and AMD cpus.

And because there's then no power saving (but a performance cost), it's
actually a negative for battery life/total energy.

(lots of experiments inside Intel seem to confirm that, it's not just
theory)





--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2010-06-04 03:46:04

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Thu, Jun 3, 2010 at 7:16 PM, Linus Torvalds
<[email protected]> wrote:
>
>
> On Fri, 4 Jun 2010, Ingo Molnar wrote:
>>
>> What you say is absolutely true, hence this would be driven via sched_tick() +
>> TIF notifiers - i.e. only ever treat user-mode tasks as 'idle-able'. This can
>> be done with no overhead to the regular fastpaths.
>>
>> The TIF notifier would be the one scheduling to idle - and would thus do it
>> only to user-mode tasks.
>
> The thing is, unless there is some _really_ deep other reason to do
> something like this, I still think it's total overdesign to push any
> knowledge/choices like this into the scheduler. I'd rather keep things way
> more independent, less tied to each other and to deep kernel subsystems.
>
> IOW, my personal opinion is that somethng like a suspend (blocker or not)
> decision simply shouldn't be important enough to be tied into the
> scheduler. Especially not if it could just be its own layer.
>
> That said, as far as I know, the Android people have mostly been looking
> at the suspend angle from a single-core standpoint. And I'm not at all
> convinced that they should hijack the existing "/sys/power/state" thing
> which is what I think they do now.
>

While it is true that we have not used this code on a multi core
system yet, I'm not sure why multiple cores codes would affect it. We
annotate that works needs to be done before it is safe to suspend, but
we don't care which core does the work (or if multiple cores do pieces
of it).

> And those two things go together. The /sys/power/state thing is a global
> suspend - which I don't think is appropriate for a opportunistic thing in
> the first place, especially for multi-core.
>
> A well-designed opportunistic suspend should be a two-phase thing: an
> opportunistc CPU hotunplug (shutting down cores one by one as the system
> is idle), and not a "global" event in the first place. And only when
> you've reached single-core state should you then say "do I suspend the
> system too".
>

This seems to fit better into the cpuidle and/or frequency scaling framework.

> So I've tried to look a bit at the patches, and my admittedly rough
> comments so far is
>
> ?- I really do prefer the "off to the side" approach that the current
> ? google opportunistic suspend patches have. As mentioned, I don't think
> ? this should be deep in the scheduler. Not at all.
>
> ?- I do think there are possibly races and CPU idle issues there, but I
> ? think they are mainly for the multi-core thing. And I think that's a
> ? totally separate issue. Or it _should_ be.
>

I'm not aware of any races with multi-core systems unless there are
existing problems in suspend. We check if any suspend blockers are
active after disable_nonboot_cpus() has returned.

> ?- once you're single-core (whether because you never had more cores to
> ? begin with, or because the "opportunistic CPU offlining" has taken down
> ? the other cores), I think the suspend-blocker is fine as a concept, and
> ? certainly shouldn't need any deep scheduler hooks.
>
> so I'd like to see the opportunistc suspend thing think about CPU
> offlining,

I see this as a separate problem. We ignore a single busy CPU for
opportunistic suspend, so why should the number of online CPUs matter?

> and I'd like to see it disconnect from the existing
> /sys/power/state.

The entry point is not important to us. The current interface is what
Rafael wanted instead of the /sys/power/request-state interface which
is what we changed it to last year.

> And I'd really not like to involved deep internal kernel
> hooks into it.
>
> But I'll also admit that maybe I'm not seeing some problems. I've frankly
> tried to avoid the whole discussion until Andrew pulled me in yesterday.
>
> ? ? ? ? ? ? ? ? ? ? ? ?Linus
>



--
Arve Hj?nnev?g

2010-06-04 04:38:47

by NeilBrown

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 4 Jun 2010 01:23:02 +0200
Ingo Molnar <[email protected]> wrote:

> Btw., i'd like to summarize the scheduler based suspend scheme proposed by
> Thomas Gleixner, Peter Zijlstra and myself. I found no good summary of it in
> the big thread, and there are also new elements of the proposal:

Hi
I would like to summarise the alternate proposal that I an others have
suggested in a variety of different forms.

It starts from the premise that
1/ Android developers actually like the "big hammer" aspect of suspend.
Initiating suspend powers down some devices, puts others in low power
states, freezes all processes and generally puts the device to sleep
with a well defined and easily controlled (at the whole-of-system level)
set of events that will wake from suspend. This is a big part of the
Android approach to power-saving and I'm guessing they are not keen to
depart from it.

2/ The main problem with using suspend as-is is that it is racy.
The purpose of suspend is to put the device to sleep until a wake-event
occurs. When that wake-event occurs at much the same time that suspend is
requested races can occur. We want a wake-event to not only wake the
device, to be keep the device awake while the wake-event is being handled,
and to cancel any suspend that was initiated before the wake event
completed.
We need to understand "wake event" in an holistic sense. If a key press is
expected to brighten the screen and make a glyph appear, and if that key
press is considered to be a wake-event, then the glyph appearing must also
be a part of the wake event. For such a holistic wake-event to fully
block/cancel a suspend there much be some mechanism for hand-over of
wake-events from kernel-space to user-space.

Given those premises, google's suspend-blocker approach was to allow a
kernel thread to initiate suspend whenever nothing was stopping it, and to
allow both drivers and user processes to block that suspend while handling
a wake event (or anything else that needed to keep the device awake).
In this case the hand-over is fairly straight forward as the kernel thread
as full knowledge and can easily wait for all sorts of things.

The alternate proposal is simply to have user-space initiate a suspend (as
is already possible), user-space processes can then trivially block that
suspend through any of a number of IPC approaches, and kernel space drivers
can block/abort suspend by explicitly requesting a block.

The variety of alternate proposals comes from a variety of ways to modify
the semantics of "ask for a suspend" in such a way that userspace can
discover when there are kernel-space blocks, and can wait for them to be
released without spinning.

A sample modification (which I think is different to all the ones
mentioned so far, and hopefully pulls out the best of them all) is
to allow userspace to write e.g. "mem_safe" rather than "mem" to
/sys/power/state. The 'safe' implies it is safe from races.

When this is written, the process sleeps in an interruptible state until
all in-kernel suspend blocks have been dropped. If any such suspend blocks
were found, or if a signal is received, the request aborts. Only if there
were no suspend blocks and no pending signals does the suspend progress.

wake-events in the kernel then need to be tracked all the way to user-space,
and the in-kernel lock is only dropped when the event is consumed by
user-space. User-space must take some sort of lock to ensure no new
suspend is requested before consuming any wake-events from the kernel.

I believe this is very close to what android has today, only with a much
smaller change to the user-space interface, which I believe to be the thing
that has been found most objectionable.
I does still require a degree of event-tracking within the kernel which
might still be objectionable - I'm not so sure about different people's
positions on that.

Thanks,
NeilBrown

2010-06-04 04:51:59

by Linus Torvalds

[permalink] [raw]
Subject: Re: suspend blockers & Android integration



On Thu, 3 Jun 2010, Arjan van de Ven wrote:
>
> And because there's then no power saving (but a performance cost), it's
> actually a negative for battery life/total energy.

Including the UP optimizations we do (ie lock prefix removal)? It's
possible that I'm just biased by benchmarks, and it's true that Intel has
been getting lots better, but the locking costs are very noticeable
performance-wise on some benchmarks.

And several CPU's have been held back from going into deepest sleep states
by stupid firmware and/or platform bugs.

But hey, if it's not going to help, and people have tried it, I guess I'll
have to believe it.

Linus

2010-06-04 04:54:52

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Thu, Jun 3, 2010 at 4:23 PM, Ingo Molnar <[email protected]> wrote:
...
> ?- Controlled auto-suspend: drivers (such as input) could on wakeup
> ? automatically set the 'minimum wakeup latency' value of wakee tasks to a
> ? lower value. This automatically prevents another auto-suspend in the near
> ? future: up to the point the wakee task increases its latency (via the
> ? scheduler syscall) again and allows suspend again.
>

How do you clear the latency value in a safe way? If another wakeup
event happens right after your wakee task is done processing the last
event and decides to increase its latency, auto suspend will be
allowed even though you have an unprocessed wakeup event. Also how do
you know which task will read the event if it is not already waiting
for it?


> ? This means there will be no surprise suspends for a task that may take a
> ? bit longer than usual to finish its work. [ Detail: this would only be done
> ? for tasks that have a non-default (non-infinity) task->latency value - to
> ? prevent the input driver from lowering latency values (and preventing
> ? future suspends) just because some unaware apps are running and using input
> ? drivers. ]

Don't you need two inifinity values for this?

--
Arve Hj?nnev?g

2010-06-04 06:22:58

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Linus Torvalds <[email protected]> wrote:

> [...]
>
> And those two things go together. The /sys/power/state thing is a global
> suspend - which I don't think is appropriate for a opportunistic thing in
> the first place, especially for multi-core.
>
> A well-designed opportunistic suspend should be a two-phase thing: an
> opportunistc CPU hotunplug (shutting down cores one by one as the system is
> idle), and not a "global" event in the first place. And only when you've
> reached single-core state should you then say "do I suspend the system too".

Shutting a core down would be a natural idle level, and when the last one goes
idle we can do the suspend. (it happens as part of suspend anyway)

So on systems that dont want to auto-suspend this would indeed behave like you
suggest: the final core left would run as UP in essence.

Ingo

2010-06-04 07:14:39

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Arve Hj?nnev?g <[email protected]> wrote:

> On Thu, Jun 3, 2010 at 4:23 PM, Ingo Molnar <[email protected]> wrote:
> ...
> > ?- Controlled auto-suspend: drivers (such as input) could on wakeup
> > ? automatically set the 'minimum wakeup latency' value of wakee tasks to a
> > ? lower value. This automatically prevents another auto-suspend in the near
> > ? future: up to the point the wakee task increases its latency (via the
> > ? scheduler syscall) again and allows suspend again.
> >
>
> How do you clear the latency value in a safe way? If another wakeup event
> happens right after your wakee task is done processing the last event and
> decides to increase its latency, auto suspend will be allowed even though
> you have an unprocessed wakeup event. Also how do you know which task will
> read the event if it is not already waiting for it?

The easiest solution would be to not do any of that initially. (If it's ever a
concern we could subtract/add without destroying the nesting property)

Why do you need to track input wakeups? It's rather fragile and rather
unnecessary - the idle drivers know it very well how to not go into the
deepest idle mode already today. We wont hit C8 on laptops when you are using
the desktop.

> > ? This means there will be no surprise suspends for a task that may take a
> > ? bit longer than usual to finish its work. [ Detail: this would only be done
> > ? for tasks that have a non-default (non-infinity) task->latency value - to
> > ? prevent the input driver from lowering latency values (and preventing
> > ? future suspends) just because some unaware apps are running and using input
> > ? drivers. ]
>
> Don't you need two inifinity values for this?

Yes - any value above the max idle latency in the system will do.

Thanks,

Ingo

2010-06-04 07:37:49

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, Jun 4, 2010 at 12:13 AM, Ingo Molnar <[email protected]> wrote:
>
> * Arve Hj?nnev?g <[email protected]> wrote:
>
>> On Thu, Jun 3, 2010 at 4:23 PM, Ingo Molnar <[email protected]> wrote:
>> ...
>> > ?- Controlled auto-suspend: drivers (such as input) could on wakeup
>> > ? automatically set the 'minimum wakeup latency' value of wakee tasks to a
>> > ? lower value. This automatically prevents another auto-suspend in the near
>> > ? future: up to the point the wakee task increases its latency (via the
>> > ? scheduler syscall) again and allows suspend again.
>> >
>>
>> How do you clear the latency value in a safe way? If another wakeup event
>> happens right after your wakee task is done processing the last event and
>> decides to increase its latency, auto suspend will be allowed even though
>> you have an unprocessed wakeup event. Also how do you know which task will
>> read the event if it is not already waiting for it?
>
> The easiest solution would be to not do any of that initially. (If it's ever a
> concern we could subtract/add without destroying the nesting property)
>
> Why do you need to track input wakeups? It's rather fragile and rather

Because we have keys that should always turn the screen on, but the
problem is not specific to input events. If we enabled a wakeup event
it usually means we need this event to always work, not just when the
system is fully awake or fully suspended.

> unnecessary - the idle drivers know it very well how to not go into the
> deepest idle mode already today. We wont hit C8 on laptops when you are using
> the desktop.
>

The whole point allow the use of suspend.

>> > ? This means there will be no surprise suspends for a task that may take a
>> > ? bit longer than usual to finish its work. [ Detail: this would only be done
>> > ? for tasks that have a non-default (non-infinity) task->latency value - to
>> > ? prevent the input driver from lowering latency values (and preventing
>> > ? future suspends) just because some unaware apps are running and using input
>> > ? drivers. ]
>>
>> Don't you need two inifinity values for this?
>
> Yes - any value above the max idle latency in the system will do.
>
> Thanks,
>
> ? ? ? ?Ingo
>



--
Arve Hj?nnev?g

2010-06-04 07:58:18

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Brian Swetland <[email protected]> wrote:

> On Thu, Jun 3, 2010 at 12:30 PM, Ingo Molnar <[email protected]> wrote:
> >
> > Sadly the response from the Android team has been 100% uncompromising: either
> > suspend blockers or nothing.
>
> Well, we're willing to accept something that gives us the same
> functionality (thus rewriting the api several times to meet various
> objections, current discussions around
> constraint-based-implementations / pm-qos, etc). We believe we're
> solving a real problem here and have not seen a counter-proposal that
> accomplishes the same.
>
> Suggestions such as "just yell at developers for writing bad apps" or
> "it's the user's fault if they install a lousy app" or "make your app
> marketplace more restrictive" are not helpful. [...]

Agreed.

> [...] The technical discussions around alternatives are more so (though I
> do feel like we're going in circles in places), [...]

Yep.

> [...] which again is why we're still here talking about this (that and Arve
> is about a billion times more patient and persistent than I am).
>
> We're not interested in massively rearchitecting our userspace to accomplish
> this (and the "rewrite your userspace!" proposals I've seen have had race
> conditions and/or significant more complexity than the wakelock model).

Having a somewhat different ABI for achieving things you'll probably have
prepare for. I doubt it would result in any large-scale, massive rewrites.

> ...
>
> > Also, why did the Android team start its contributions with such a
> > difficult and controversial kernel feature?
>
> We started here because it's possibly the only api level change we have --
> almost everything else is driver or subarch type work or controversial but
> entirely self-contained (like the binder, which I would be shocked to see
> ever hit mainline). [...]

So why arent those bits mainline? It's a 1000 times easier to get drivers and
small improvements and non-ABI changes upstream.

After basically two years of growing your fork (and some attempts to get your
drivers into drivers/staging/ - from where they have meanwhile dropped out
again) you re-started with the worst possible thing to merge: a big and
difficult kernel feature affecting many subsystems. Why?

This is one of the fundamental problems here. People simply dont know you,
because you have not worked with us much - and hence they dont trust you
positively out of box - they are neutral at best.

And believe me, it's hard enough to get difficult features upstream if people
_do_ know you and when they positively _do_ trust you ... Arent you talking to
Andrew Morton about how to do these things properly? This is kernel
contribution 101 really.

> [...] Assertions have been made that because the "android kernel" (not a
> term I like -- linux is linux, we have some assorted patches on top) [...]

I've been tracking android-common and android-msm for a while and i have to
say that it shows a very lackluster attitude towards upstream:

- The latest branches i can see are v2.6.32 based today. We are in the
v2.6.35 stabilization cycle and are developing v2.6.36. I.e. your upstream
base is about a year too old.

- The last commit is a couple of weeks old AFAICS.

- The diffstat of android-common/android-2.6.32 is:

890 files changed, 39962 insertions(+), 6286 deletions(-)

Those assorted patches have spread over nearly a thousand files. FYI, by
the looks of it you are facing an exponentially worsening maintenance
overhead curve here.

Is there perhaps some other tree i should be following? I'm looking at:

[remote "android-msm"]
url = git://android.git.kernel.org/kernel/msm.git
fetch = +refs/heads/*:refs/remotes/android-msm/*
[remote "android-common"]
url = git://android.git.kernel.org/kernel/common.git
fetch = +refs/heads/*:refs/remotes/android-common/*

Btw., the commits i've glanced at looked mostly clean and well structured, so
i see no fundamental reason why this couldn't be done better.

> See: http://www.kroah.com/log/linux/android-kernel-problems.html and various
> other rants about the evil terrible android forks, etc.
>
> So, we figure, let's sort out the hard problem first and then move on with
> our lives.

Well, my suggestion would be to first build up a path towards upstream, build
up trust, reduce your very high cross section to mainline - and do the most
difficult bits last.

Especially 'move on with our lives' suggests that you just want to get rid of
this ABI divergence and continue-as-usual with the pattern of non-cooperation,
hm?

> > There is absolutely _zero_ technical reason why the Android team should
> > present this as as an all-or-nothing effort. Why not merge hw drivers
> > first (with suspend blockers commented or stubbed out), to reduce the fork
> > distance?
>
> If that's the case then there is no problem and people could stop yelling at
> us and just submit their drivers. Awesome.
>
> I can't speak for all the nameless silicon vendors Greg represents, that we
> apparently are preventing from doing this (how? I don't know!), etc, but for
> my team maintaining multiple versions of drivers is a headache, we'd rather
> square away the wakelock debate first and figure something out there, as it
> just seems like a more logical approach. Maybe we're crazy.

It's not crazy, it's just IMHO inefficient and very difficult to do it like
that. And you arent the first one to try it like that (people _always_
gravitate towards coming with their most difficult patches first - because
they are very often the most useful patches) - it's a non-trivial learning
curve IMHO.

Ingo

2010-06-04 08:12:35

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Linus Torvalds <[email protected]> wrote:

> On Fri, 4 Jun 2010, Ingo Molnar wrote:
>
> > What you say is absolutely true, hence this would be driven via
> > sched_tick() + TIF notifiers - i.e. only ever treat user-mode tasks as
> > 'idle-able'. This can be done with no overhead to the regular fastpaths.
> >
> > The TIF notifier would be the one scheduling to idle - and would thus do
> > it only to user-mode tasks.
>
> The thing is, unless there is some _really_ deep other reason to do
> something like this, I still think it's total overdesign to push any
> knowledge/choices like this into the scheduler. I'd rather keep things way
> more independent, less tied to each other and to deep kernel subsystems.

Well, the deep reason as i see it is simply the observation that what the
Android auto-suspend code implements via the suspend-blocker patches is an
idle driver and user-space scheduler in disguise. (if you count that as a deep
enough reason)

I dont mind hacks if they are local and if i dont have to maintain them, but
the objection from other folks was that suspend blockers are not that local
and not that maintainable. And if (and that's a big if) we have a global
effect anyway, then we might as well consider implementing it cleanly:

- A global /sys flag is fundamentally racy and only allows a single
user-space actor. Not a problem on mobile phones but sure violates
taste buds.

Proper per task latency attributes are not racy - we always know the
maximum/minimum values, without user-space interfering with each other.

- When done correctly we might win a couple of new features as well around
the fringes:

- Useful for power savings on mobile: crappy apps can be idled on an
intermediate level, even before the system goes totally idle. There's no
equivalent suspend-blockers feature.

- Useful for real-time tasks that want to idle lower prio tasks when some
really important thing is running - even if the real-time task might sleep.
This is superior to the 'hog the CPU' kind of hacks that have been used
for this purpose before.

- The hacks needed to express a race-free suspend/wakeup cycle are unnatural
and stem from the model being a user-space driven idle manager instead of a
proper part of task sleep/wakeup.

- None of this code seems to impact any scheduler hotpath (most of it is just
a special form of idle driver) - it's all on deeper levels of idle and, at
most, in off-line return-to-userspace codepaths. So there's no strong
performance reason _against_ some level of integration. There is indeed
the coupling effect as you mention, which weighs against.

- i also think Andoid's auto-suspend is a strategic feature to Linux: i
think auto/opportunistic suspend will matter more and more, and my guess
is that ten years most of our daily systems will be doing auto-suspend and
will have proper wakeups from suspend implemented in hardware. Not just
phones and gadgets but also portable tablets, book readers, TVs - and i
wouldnt mind a non-portable, table sized tablet either ;-)

At which point i'd hate to have some hack of a solution ingrained and
ABI-ized with little chance to move user-space to sanity.

But yes, i definitely agree with you that it all comes down to 'do we care':

- If we care we should integrate it intelligently where it belongs
conceptually: the idle drivers and the scheduler.

- If we dont care then we should isolate the hacks as much as possible - and
then the current suspend blocker patch-set is definitely a good basis to
start. (with perhaps the /sys hackery cleaned up a bit, as you suggested)

I dont favor either of the solutions too deeply - so i personally have not
NAK-ed suspend blockers - i just saw a half a dozen semi-NAKs flying from
other folks, so tried to help come up with a palatable design.

_If_ most of x86 hardware was able to suspend race-free i think deeper
integration would be a slam-dunk - as we could make it work almost everywhere.
Sadly only a tiny subset of x86 qualifies, so the argument isnt obvious. Maybe
we should pick a variant of suspend blockers and re-examine things in a few
years? It being an ABI makes it difficult tho.

What i would personally find unacceptable is to have _neither_ solutions - and
the discussion was heading towards that stage really, with both sides digging
the trenches of non-cooperation. IMHO we just cannot afford to let this drop
on the floor as the feature is immensely useful to Android and thus to Linux
at large.

Anyway, i'm glad that it's up to you ;-)

Ingo

2010-06-04 08:16:17

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Arjan van de Ven <[email protected]> wrote:

> On Thu, 3 Jun 2010 19:26:50 -0700 (PDT)
> Linus Torvalds <[email protected]> wrote:
>
> > If the system is idle (or almost idle) for long times, I would heartily
> > recommend actively shutting down unused cores. Some CPU's are hopefully
> > smart enough to not even need that kind of software management, but I
> > suspect even the really smart ones might be able to take advantage of the
> > kernel saying: "I'm shutting you down, you don't have to worry about
> > latency AT ALL, because I'm keeping another CPU active to do any real
> > work".
>
> sadly the reality is that "offline" is actually the same as "deepest C
> state". At best.
>
> As far as I can see, this is at least true for all Intel and AMD cpus.
>
> And because there's then no power saving (but a performance cost), it's
> actually a negative for battery life/total energy.
>
> (lots of experiments inside Intel seem to confirm that, it's not just
> theory)

Well, the scheme would only be useful if it's _NOT_ just a deep C4 state, but
something that prevents tasks from being woken to that CPU for a good period
of time. Hot-unplugging that CPU achieves that (the runqueues are pulled), so
i think in Linus's idea makes sense in principle.

[ Or have you done deep-idle experiments to that effect as well? ]

I suspect it all depends on the cost: and our current hot-unplug and
hot-replug code is all but cheap ...

Ingo

2010-06-04 08:30:10

by Brian Swetland

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, Jun 4, 2010 at 12:57 AM, Ingo Molnar <[email protected]> wrote:
> * Brian Swetland <[email protected]> wrote:
>>
>> We started here because it's possibly the only api level change we have --
>> almost everything else is driver or subarch type work or controversial but
>> entirely self-contained (like the binder, which I would be shocked to see
>> ever hit mainline). [...]
>
> So why arent those bits mainline? It's a 1000 times easier to get drivers and
> small improvements and non-ABI changes upstream.
>
> After basically two years of growing your fork (and some attempts to get your
> drivers into drivers/staging/ - from where they have meanwhile dropped out
> again) you re-started with the worst possible thing to merge: a big and
> difficult kernel feature affecting many subsystems. Why?

Because a large number of our drivers depend on it.

> This is one of the fundamental problems here. People simply dont know you,
> because you have not worked with us much - and hence they dont trust you
> positively out of box - they are neutral at best.
>
> And believe me, it's hard enough to get difficult features upstream if people
> _do_ know you and when they positively _do_ trust you ... Arent you talking to
> Andrew Morton about how to do these things properly? This is kernel
> contribution 101 really.
>
>> [...]  Assertions have been made that because the "android kernel" (not a
>> term I like -- linux is linux, we have some assorted patches on top) [...]
>
> I've been tracking android-common and android-msm for a while and i have to
> say that it shows a very lackluster attitude towards upstream:
>
>  - The latest branches i can see are v2.6.32 based today. We are in the
>   v2.6.35 stabilization cycle and are developing v2.6.36. I.e. your upstream
>   base is about a year too old.

We have some branch naming confusion and work going on in
experimental, but our active work right now is against 2.6.34 and
2.6.35-rc. The tegra2 work has been very aggressively following
mainline (rebasing against 2.6.34rc as they were getting underway),
and we've been sending those patches out for review, in hopes of
getting that tree off on a better foot.

>
>  - The last commit is a couple of weeks old AFAICS.
>
>  - The diffstat of android-common/android-2.6.32 is:
>
>      890 files changed, 39962 insertions(+), 6286 deletions(-)
>
>   Those assorted patches have spread over nearly a thousand files. FYI, by
>   the looks of it you are facing an exponentially worsening maintenance
>   overhead curve here.
>
> Is there perhaps some other tree i should be following? I'm looking at:
>
>  [remote "android-msm"]
>         url = git://android.git.kernel.org/kernel/msm.git
>         fetch = +refs/heads/*:refs/remotes/android-msm/*
>  [remote "android-common"]
>         url = git://android.git.kernel.org/kernel/common.git
>         fetch = +refs/heads/*:refs/remotes/android-common/*
>
> Btw., the commits i've glanced at looked mostly clean and well structured, so
> i see no fundamental reason why this couldn't be done better.

I think the fundamental issue we keep bumping into is the turnaround
time on patch review / inclusion (again we're trying to get things
going much earlier on tegra2 to hopefully have less pain there). We
aim for kernel style compliance (though we're not perfect and we make
our share of mistakes), but previously when I tried sending mach-msm
stuff out, it seemed infeasible to send 30-60+ patches, so we'd start
with 5-10, feedback would trickle in over the course of a week, I'd
respin, etc. After a couple weeks some stuff would get picked up
toward a merge window but the rest would have to wait. And then we
hit crunch to ship, etc, and get behind.

Totally our fault that we're not just constantly pushing patches (and
we're trying to get a fulltime engineer or two just to work on
upstream related stuff), but we rapidly hit the point where what we're
sending up is a drop in the bucket compared to the work we're doing
and things keep diverging, etc.

I'm told this happens to everyone, is common, etc. We're (seriously)
a small team, trying to ship multiple products a year and keep our
head above water here, and unfortunately that means we keep tabling
these projects until we can find some cycles to give it another go and
the delta grows.

>> So, we figure, let's sort out the hard problem first and then move on with
>> our lives.
>
> Well, my suggestion would be to first build up a path towards upstream, build
> up trust, reduce your very high cross section to mainline - and do the most
> difficult bits last.

Having to maintain two versions of about half our driver code because
we depend on an ABI not in mainline is a significant factor for us --
it's difficult to have what's going upstream lag behind our active
work (basically we have to maintain two different trees -- one for
mainline one for ship) already, but having these codelines also be
different makes it worse for us.

> Especially 'move on with our lives' suggests that you just want to get rid of
> this ABI divergence and continue-as-usual with the pattern of non-cooperation,
> hm?

I'd like to make some forward progress either to get something
wakelock-ish in and shift to whatever that api is, or to get a clear
"no not going to happen" and deal with the fallout there.

...

Sadly, for mach-msm, we're now further out due to maintainership
shifts (Daniel stepped up to do msm stuff, is pushing up some hybrid
of our work and Qualcomm's work that doesn't seem to really fit with
either, and I have no idea how to sanely get our stuff to sit on top
of that). I'd love to find some time to sit down, clean up the whole
msm tree for 8x50/7x30 which is (largely) pretty clean, and is
extremely stable and shippable, and try to get it into a patch series
and headed upstream, but we're now colliding with the upstream
mach-msm which has gone off in a different direction, etc.

Anyway, we continue to try to figure out how to make stuff work better
(again, trying some different approaches with tegra2), but so far the
process of getting code upstream has been extremely time intensive and
rather frustrating and it remains unclear who can sign off on what and
how many hoops different people will keep asking us to jump through.

Brian

2010-06-04 08:35:21

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Arve Hj?nnev?g <[email protected]> wrote:

> > [...]
> >
> > Why do you need to track input wakeups? It's rather fragile and rather
> > unnecessary [...]
>
> Because we have keys that should always turn the screen on, but the problem
> is not specific to input events. If we enabled a wakeup event it usually
> means we need this event to always work, not just when the system is fully
> awake or fully suspended.

Hm, i cannot follow that generic claim. Could you please point out the problem
to me via a specific example? Which task does what, what undesirable thing
happens where, etc.

Thanks,

Ingo

2010-06-04 08:55:49

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Brian Swetland <[email protected]> wrote:

> On Fri, Jun 4, 2010 at 12:57 AM, Ingo Molnar <[email protected]> wrote:
> > * Brian Swetland <[email protected]> wrote:
> >>
> >> We started here because it's possibly the only api level change we have
> >> -- almost everything else is driver or subarch type work or controversial
> >> but entirely self-contained (like the binder, which I would be shocked to
> >> see ever hit mainline). [...]
> >
> > So why arent those bits mainline? It's a 1000 times easier to get drivers
> > and small improvements and non-ABI changes upstream.
> >
> > After basically two years of growing your fork (and some attempts to get
> > your drivers into drivers/staging/ - from where they have meanwhile
> > dropped out again) you re-started with the worst possible thing to merge:
> > a big and difficult kernel feature affecting many subsystems. Why?
>
> Because a large number of our drivers depend on it.

So why not put in some stub or so? Auto-suspend/suspend-blockers is a feature,
and drivers ought to be able to work without a feature as well. Keep the
suspend-blocker changes in the android tree initially, and get the main body
of changes out first, and establish a flow of timely changes. That reduces
your maintenance burden and increases trust for future changes - a win-win
situation.

In any case, this is not to suggest that the suspend-blocker bits are
'impossible' to merge. I just say that if you start with your most difficult
feature you should not be surprised to be on the receiving end of a 1000+
mails flamewar on lkml ;-)

> > I've been tracking android-common and android-msm for a while and i have
> > to say that it shows a very lackluster attitude towards upstream:
> >
> > ??- The latest branches i can see are v2.6.32 based today. We are in the
> > ?? v2.6.35 stabilization cycle and are developing v2.6.36. I.e. your
> > upstream ?? base is about a year too old.
>
> We have some branch naming confusion and work going on in
> experimental, but our active work right now is against 2.6.34 and
> 2.6.35-rc. [...]

That's nice!

> [...] The tegra2 work has been very aggressively following mainline
> (rebasing against 2.6.34rc as they were getting underway), and we've been
> sending those patches out for review, in hopes of getting that tree off on a
> better foot.

Ah, googling for 'tegra2' gave me the magic URI:

git remote add android-tegra2 git://android.git.kernel.org/kernel/tegra.git

I generally roam various trees for scheduler patches when i can, seeing what
problems people are facing and trying to prevent more painful forks from
developing. You have these changes there currently:

d82647e: sched: make task dump print all 15 chars of proc comm
5e3e0f1: sched: Enable might_sleep before initializing drivers.

Please submit 5e3e0f1. We can probably do that one even simpler, by turning
__might_sleep_init_called into the only flag that __might_sleep() checks -
i.e. not checking system_state at all.

Also, please submit d82647e, it makes sense too.

Thanks,

Ingo

2010-06-04 08:56:35

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, Jun 4, 2010 at 1:34 AM, Ingo Molnar <[email protected]> wrote:
>
> * Arve Hj?nnev?g <[email protected]> wrote:
>
>> > [...]
>> >
>> > Why do you need to track input wakeups? It's rather fragile and rather
>> > unnecessary [...]
>>
>> Because we have keys that should always turn the screen on, but the problem
>> is not specific to input events. If we enabled a wakeup event it usually
>> means we need this event to always work, not just when the system is fully
>> awake or fully suspended.
>
> Hm, i cannot follow that generic claim. Could you please point out the problem
> to me via a specific example? Which task does what, what undesirable thing
> happens where, etc.
>

We have many wakeup events, and some of them are invisible to the
user. For instance on the Nexus One wake up every 10 minutes monitor
the battery health. If the user presses a key right after this work
has finished and we did not block suspend until userspace could
process this key event, we risk suspending before we could turn the
screen on, which to the user looks like the key did not work. Another
example, the user pressed the power key which turns the screen off and
allows suspend. We initiate suspend and a phone call comes in. If we
don't block suspend until we processed the incoming phone call
notification, the phone may never ring (some devices will send a new
message every few seconds for this, so on those devices it would just
delay the ringing).

--
Arve Hj?nnev?g

2010-06-04 09:03:15

by Pekka Enberg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, Jun 4, 2010 at 11:55 AM, Ingo Molnar <[email protected]> wrote:
> In any case, this is not to suggest that the suspend-blocker bits are
> 'impossible' to merge. I just say that if you start with your most difficult
> feature you should not be surprised to be on the receiving end of a 1000+
> mails flamewar on lkml ;-)

Indeed. This 'all or nothing' approach hasn't worked well in the past
and I highly doubt it will work now. It's much easier to work with
people when you have a track record of getting things merged and
actually maintaining the code.

Pekka

2010-06-04 09:08:11

by Brian Swetland

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, Jun 4, 2010 at 1:55 AM, Ingo Molnar <[email protected]> wrote:
> * Brian Swetland <[email protected]> wrote:
>> > After basically two years of growing your fork (and some attempts to get
>> > your drivers into drivers/staging/ - from where they have meanwhile
>> > dropped out again) you re-started with the worst possible thing to merge:
>> > a big and difficult kernel feature affecting many subsystems. Why?
>>
>> Because a large number of our drivers depend on it.
>
> So why not put in some stub or so? Auto-suspend/suspend-blockers is a feature,
> and drivers ought to be able to work without a feature as well. Keep the
> suspend-blocker changes in the android tree initially, and get the main body
> of changes out first, and establish a flow of timely changes. That reduces
> your maintenance burden and increases trust for future changes - a win-win
> situation.

The impression I got from previous discussions was that upstream did
not want things that were built conditionally around APIs that did not
exist in mainline nor stub implementations for things that were not
agreed upon.

We could easily either #if defined(CONFIG_SUSPEND_BLOCKERS) or submit
a suspend_blockers.h that just makes everything a no-op, if that's an
acceptable transition vehicle. I didn't think either were an option
open to us.

> In any case, this is not to suggest that the suspend-blocker bits are
> 'impossible' to merge. I just say that if you start with your most difficult
> feature you should not be surprised to be on the receiving end of a 1000+
> mails flamewar on lkml ;-)

Yeah, I do understand that we're not making it easy for ourselves
here. I think we hit the point where Rafael and Matthew signed off on
things and thought "aha, linux-pm maintainers are happy, now we're
getting somewhere" only to realize the light at the end of the tunnel
was a bit further out than we anticipated ^^

Brian

2010-06-04 09:44:23

by Peter Zijlstra

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 2010-06-04 at 01:23 +0200, Ingo Molnar wrote:
> Btw., i'd like to summarize the scheduler based suspend scheme proposed by
> Thomas Gleixner, Peter Zijlstra and myself. I found no good summary of it in
> the big thread, and there are also new elements of the proposal:

Just to clarify, my proposition doesn't go much further than treating
'suspend' as a genuine idle state (on suitable hardware, which x86 isn't).

> - Create a 'deep idle' mode that suspends. This, if all constraints
> are met, is triggered by the scheduler automatically: just like the other
> idle modes are triggered currently. This approach fixes the wakeup
> races because an incoming wakeup event will set need_resched() and
> abort the suspend.
>

Right, so 'suspend' as idle seems (at least on UP/arm) a very sensible
idea. On SMP current suspend hot-unplugs all but the boot cpu, I'm not
sure we need to do that, since if the system is genuinely idle, what races
are there?

And if its not idle...

> ( This mode can even use the existing suspend code to bring stuff down,
> therefore it also solves the pending timer problem and works even on
> PC style x86. )

You cannot solve the pending timer issue from idle, unless you allow idle
to stop clock_monotonic, which would change idle semantics, and that is not
something I can say is a good idea.

You want all idle states to have the same semantics, otherwise things just
get way too confusing.

> - Solve crappy app confinement via the scheduler:
>
> A first proposal was to use the existing cgroup mechanism,

I still believe containment is a cgroup problem. The freeze/snapshot/resume
container folks seem to face many of the same problems. Including the
pending timer one I suspect. Lets solve it there.

> - Controlled auto-suspend: drivers (such as input) could on wakeup
> automatically set the 'minimum wakeup latency' value of wakee tasks to a
> lower value. This automatically prevents another auto-suspend in the near
> future: up to the point the wakee task increases its latency (via the
> scheduler syscall) again and allows suspend again.

I think treating wakeups special like that is a mistake. I also think the
kernel should never adjust a task's QoS attributes, the user set them in
the expectation of them being respected.

I'm not really sure about the interaction between wakeups and untrusted
apps. It seems to me that an untrusted app needs a trusted intermediate
anyway, that intermediate can be responsible for freezing/unfreezing of the
untrusted app.

So either the app asks for suspend blockers through the intermediate, or it's
cgroup is managed by the intermediate -- should work out to the same end
result, right?

2010-06-04 09:54:19

by Peter Zijlstra

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 2010-06-04 at 11:43 +0200, Peter Zijlstra wrote:
> I still believe containment is a cgroup problem. The freeze/snapshot/resume
> container folks seem to face many of the same problems. Including the
> pending timer one I suspect. Lets solve it there.

While talking to Thomas about this, we'd probably need a CLOCK_MONOTONIC
namespace to pull this off, so that resumed apps don't see the jump in
absolute time.

This would also help with locating the relevant timers, since they'd be
on the related timer base.

The only 'interesting' issue I can see here is that if you create 1000
CLOCK_MONOTONIC namepaces, we'd need to have a tree of trees in order to
efficiently find the leftmost timer.

2010-06-04 09:59:46

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Brian Swetland <[email protected]> wrote:

> On Fri, Jun 4, 2010 at 1:55 AM, Ingo Molnar <[email protected]> wrote:
> > * Brian Swetland <[email protected]> wrote:
> >> > After basically two years of growing your fork (and some attempts to get
> >> > your drivers into drivers/staging/ - from where they have meanwhile
> >> > dropped out again) you re-started with the worst possible thing to merge:
> >> > a big and difficult kernel feature affecting many subsystems. Why?
> >>
> >> Because a large number of our drivers depend on it.
> >
> > So why not put in some stub or so? Auto-suspend/suspend-blockers is a
> > feature, and drivers ought to be able to work without a feature as well.
> > Keep the suspend-blocker changes in the android tree initially, and get
> > the main body of changes out first, and establish a flow of timely
> > changes. That reduces your maintenance burden and increases trust for
> > future changes - a win-win situation.
>
> The impression I got from previous discussions was that upstream did not
> want things that were built conditionally around APIs that did not exist in
> mainline nor stub implementations for things that were not agreed upon.

Well, if it's some ugly #ifdef solution i could imagine light objections on
pure aesthetic micro-grounds.

> We could easily either #if defined(CONFIG_SUSPEND_BLOCKERS) or submit a
> suspend_blockers.h that just makes everything a no-op, if that's an
> acceptable transition vehicle. I didn't think either were an option open to
> us.

You can certainly put in a suspend_blockers.h thing into some Android
directory, and populate it with empty wrappers - as long as you only use it
within Android drivers and not core kernel code or other subsystems you dont
maintain.

It's being done all the time and helpful cleanup patches eliminating the stubs
are frowned upon (unless the subs are there like for years with no progress
and no maintenance in sight).

Putting empty stubs into include/linux/ would be pushing things i think.

In fact sometimes architectures even jump the gun with major kernel features:
we had a dynticks implementation in ARM for years, we had RTLinux stubs in x86
code for quite some time, and we still have perfmon in IA64 - despite the core
kernel having gone for a different design.

It's certainly not ideal, but it's certainly a solution that is used every now
and then. The less difference there is between trees the easier it becomes to
merge - for both sides, both technically and socially.

> > In any case, this is not to suggest that the suspend-blocker bits are
> > 'impossible' to merge. I just say that if you start with your most
> > difficult feature you should not be surprised to be on the receiving end
> > of a 1000+ mails flamewar on lkml ;-)
>
> Yeah, I do understand that we're not making it easy for ourselves here. I
> think we hit the point where Rafael and Matthew signed off on things and
> thought "aha, linux-pm maintainers are happy, now we're getting somewhere"
> only to realize the light at the end of the tunnel was a bit further out
> than we anticipated ^^

That's a well-known problem on lkml: the light at the end of the tunnel was
the other train ;-)

Anyway, i'm not pessimistic at all: _some_ sort of scheme appears to be
crystalising out today. Everyone seems to agree now that the main usecases are
indeed useful and need handling one way or another - the rest is really just
technological discussions how to achieve the mostly-agreed-upon end goal.

The worst situation are features where one side says 'we dont need this kind
of functionality at all' - IMO auto/opportunistic-suspend isnt in that
situation, fortunately.

Thanks,

Ingo

2010-06-04 10:04:13

by Ingo Molnar

[permalink] [raw]
Subject: Re: suspend blockers & Android integration


* Peter Zijlstra <[email protected]> wrote:

> On Fri, 2010-06-04 at 11:43 +0200, Peter Zijlstra wrote:
> > I still believe containment is a cgroup problem. The freeze/snapshot/resume
> > container folks seem to face many of the same problems. Including the
> > pending timer one I suspect. Lets solve it there.
>
> While talking to Thomas about this, we'd probably need a CLOCK_MONOTONIC
> namespace to pull this off, so that resumed apps don't see the jump in
> absolute time.
>
> This would also help with locating the relevant timers, since they'd be on
> the related timer base.

Ok - this looks workable, and looks technically isolated that can be pursued
as a separate module of this whole topic.

> The only 'interesting' issue I can see here is that if you create 1000
> CLOCK_MONOTONIC namepaces, we'd need to have a tree of trees in order to
> efficiently find the leftmost timer.

Realistically Android userspace would create just a single such namespace for
all the untrusted/unknown/uncontrolled apps, right?

Ingo

2010-06-04 10:08:51

by Peter Zijlstra

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 2010-06-04 at 12:03 +0200, Ingo Molnar wrote:

> > The only 'interesting' issue I can see here is that if you create 1000
> > CLOCK_MONOTONIC namepaces, we'd need to have a tree of trees in order to
> > efficiently find the leftmost timer.
>
> Realistically Android userspace would create just a single such namespace for
> all the untrusted/unknown/uncontrolled apps, right?

Possibly, yeah.

But it might not stop someone else from create an insane amount of them.
So we do need to deal with that, and a linear loop over all timer bases,
which then will be a user controlled quantity, just doesn't sound
right :-)

2010-06-04 10:09:45

by Brian Swetland

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, Jun 4, 2010 at 2:59 AM, Ingo Molnar <[email protected]> wrote:
>
> You can certainly put in a suspend_blockers.h thing into some Android
> directory, and populate it with empty wrappers - as long as you only use it
> within Android drivers and not core kernel code or other subsystems you dont
> maintain.
>
> It's being done all the time and helpful cleanup patches eliminating the stubs
> are frowned upon (unless the subs are there like for years with no progress
> and no maintenance in sight).
>
> Putting empty stubs into include/linux/ would be pushing things i think.
>
> In fact sometimes architectures even jump the gun with major kernel features:
> we had a dynticks implementation in ARM for years, we had RTLinux stubs in x86
> code for quite some time, and we still have perfmon in IA64 - despite the core
> kernel having gone for a different design.
>
> It's certainly not ideal, but it's certainly a solution that is used every now
> and then. The less difference there is between trees the easier it becomes to
> merge - for both sides, both technically and socially.

Totally -- our goal would be that as drivers find their way from our
tree to mainline we'd keep them 1:1 between the trees. If we can it a
local suspend_blocker.h somewhere while the long term solution gets
hashed out that'd remove the biggest painpoint on a driver level. I'm
not quite sure where the best place to drop such a thing would be --
we'd likely be including it from mach-msm, mach-tegra2, and drivers
for both those architectures in the normal driver places for the tree.
I guess we could just drop it in
arch/arm/mach-{msm,tegra2}/include/mach/ and both the subarch code and
subarch-specific-drivers we've been writing could pick it up via
#include <mach/suspend_blockers.h>

>> Yeah, I do understand that we're not making it easy for ourselves here.  I
>> think we hit the point where Rafael and Matthew signed off on things and
>> thought "aha, linux-pm maintainers are happy, now we're getting somewhere"
>> only to realize the light at the end of the tunnel was a bit further out
>> than we anticipated ^^
>
> That's a well-known problem on lkml: the light at the end of the tunnel was
> the other train ;-)
>
> Anyway, i'm not pessimistic at all: _some_ sort of scheme appears to be
> crystalising out today. Everyone seems to agree now that the main usecases are
> indeed useful and need handling one way or another - the rest is really just
> technological discussions how to achieve the mostly-agreed-upon end goal.
>
> The worst situation are features where one side says 'we dont need this kind
> of functionality at all' - IMO auto/opportunistic-suspend isnt in that
> situation, fortunately.

It is encouraging that there's at least some general consensus that
the feature is useful, and as Arve and I have both mentioned, we're
really not religious about names, etc, provided we can solve the
problem we're trying to solve, so if it ends up being qos constraints
or something else entirely but still gets us where we're trying to go,
it's good news.

I think one point of contention remaining may be "just blocking
suspend" vs "halting specific untrusted processes". The latter is
difficult for us to work with because of the overall complexity of
(our) userspace environment. A big hammer where we stop it all and
suspend ends up being less deadlock/inversion-prone. Of course if the
general solution ends up being able to do either, then perhaps
everyone's happy.

Brian

2010-06-04 10:11:17

by Brian Swetland

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, Jun 4, 2010 at 3:08 AM, Peter Zijlstra <[email protected]> wrote:
> On Fri, 2010-06-04 at 12:03 +0200, Ingo Molnar wrote:
>
>> > The only 'interesting' issue I can see here is that if you create 1000
>> > CLOCK_MONOTONIC namepaces, we'd need to have a tree of trees in order to
>> > efficiently find the leftmost timer.
>>
>> Realistically Android userspace would create just a single such namespace for
>> all the untrusted/unknown/uncontrolled apps, right?
>
> Possibly, yeah.

Definitely, at least initially. If we had the ability to do more fine
grained control, we might be tempted to experiment with it, but to
start with we'd feel safer using the big hammer as we are today.

Brian

2010-06-04 10:13:41

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 4 Jun 2010, Peter Zijlstra wrote:

> On Fri, 2010-06-04 at 11:43 +0200, Peter Zijlstra wrote:
> > I still believe containment is a cgroup problem. The freeze/snapshot/resume
> > container folks seem to face many of the same problems. Including the
> > pending timer one I suspect. Lets solve it there.
>
> While talking to Thomas about this, we'd probably need a CLOCK_MONOTONIC
> namespace to pull this off, so that resumed apps don't see the jump in
> absolute time.
>
> This would also help with locating the relevant timers, since they'd be
> on the related timer base.
>
> The only 'interesting' issue I can see here is that if you create 1000
> CLOCK_MONOTONIC namepaces, we'd need to have a tree of trees in order to
> efficiently find the leftmost timer.

We can do more clever than that. All CLOCK_MONOTONIC timers can live
in the CLOCK_MONOTONIC rbtree, we just need proper annotation, i.e.:

struct hrtimer {
ktime_t expires;
......
struct list_head namespace;
ktime_t base_offset;
};

So expires would be on CLOCK_MONOTONIC as seen from the kernel, just
the user space interfaces would take the base_offset into account.

On freeze we remove the timers from the rbtree (they are easy to
find via the namespace list) and on thaw we set the base_offset
accordingly and insert them again. So no surprise for user space and
no tree of trees to walk through.

Thanks,

tglx

2010-06-04 10:13:58

by Peter Zijlstra

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 2010-06-04 at 12:11 +0200, Thomas Gleixner wrote:
> On Fri, 4 Jun 2010, Peter Zijlstra wrote:
>
> > On Fri, 2010-06-04 at 11:43 +0200, Peter Zijlstra wrote:
> > > I still believe containment is a cgroup problem. The freeze/snapshot/resume
> > > container folks seem to face many of the same problems. Including the
> > > pending timer one I suspect. Lets solve it there.
> >
> > While talking to Thomas about this, we'd probably need a CLOCK_MONOTONIC
> > namespace to pull this off, so that resumed apps don't see the jump in
> > absolute time.
> >
> > This would also help with locating the relevant timers, since they'd be
> > on the related timer base.
> >
> > The only 'interesting' issue I can see here is that if you create 1000
> > CLOCK_MONOTONIC namepaces, we'd need to have a tree of trees in order to
> > efficiently find the leftmost timer.
>
> We can do more clever than that. All CLOCK_MONOTONIC timers can live
> in the CLOCK_MONOTONIC rbtree, we just need proper annotation, i.e.:
>
> struct hrtimer {
> ktime_t expires;
> ......
> struct list_head namespace;
> ktime_t base_offset;
> };
>
> So expires would be on CLOCK_MONOTONIC as seen from the kernel, just
> the user space interfaces would take the base_offset into account.
>
> On freeze we remove the timers from the rbtree (they are easy to
> find via the namespace list) and on thaw we set the base_offset
> accordingly and insert them again. So no surprise for user space and
> no tree of trees to walk through.

Ah indeed, much nicer.

2010-06-04 10:42:31

by Andi Kleen

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

Linus Torvalds <[email protected]> writes:

> On Thu, 3 Jun 2010, Arjan van de Ven wrote:
>>
>> And because there's then no power saving (but a performance cost), it's
>> actually a negative for battery life/total energy.
>
> Including the UP optimizations we do (ie lock prefix removal)? It's

Those only help the kernel and most workloads do not do enough kernel
execution for it to really matter, but spend most of their
time in user space.

Even if as kernel programmers we often have a different view, in most
cases most cycles are in user space :)

-Andi
--
[email protected] -- Speaking for myself only.

2010-06-04 12:07:22

by Peter Zijlstra

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 2010-06-04 at 01:56 -0700, Arve Hjønnevåg wrote:
> On Fri, Jun 4, 2010 at 1:34 AM, Ingo Molnar <[email protected]> wrote:
> >
> > * Arve Hj?nnev?g <[email protected]> wrote:
> >
> >> > [...]
> >> >
> >> > Why do you need to track input wakeups? It's rather fragile and rather
> >> > unnecessary [...]
> >>
> >> Because we have keys that should always turn the screen on, but the problem
> >> is not specific to input events. If we enabled a wakeup event it usually
> >> means we need this event to always work, not just when the system is fully
> >> awake or fully suspended.
> >
> > Hm, i cannot follow that generic claim. Could you please point out the problem
> > to me via a specific example? Which task does what, what undesirable thing
> > happens where, etc.
> >
>
> We have many wakeup events, and some of them are invisible to the
> user. For instance on the Nexus One wake up every 10 minutes monitor
> the battery health.

> If the user presses a key right after this work
> has finished and we did not block suspend until userspace could
> process this key event, we risk suspending before we could turn the
> screen on, which to the user looks like the key did not work.

> Another
> example, the user pressed the power key which turns the screen off and
> allows suspend. We initiate suspend and a phone call comes in. If we
> don't block suspend until we processed the incoming phone call
> notification, the phone may never ring (some devices will send a new
> message every few seconds for this, so on those devices it would just
> delay the ringing).

Right, so in the proposed scheme all these tasks would be executed by
trusted processes, and trusted processes will never get frozen and so
will never be delayed in processing these events.

Only untrusted code will be frozen. And trusted processes are reliable
for thawing the untrusted processes and delivering events to it.

Trusted processes are assumed to be sane and idle when there is nothing
for them to do, allowing the machine to go into deep idle states.

2010-06-04 14:24:18

by James Bottomley

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 2010-06-04 at 11:59 +0200, Ingo Molnar wrote:
> * Brian Swetland <[email protected]> wrote:
> > On Fri, Jun 4, 2010 at 1:55 AM, Ingo Molnar <[email protected]> wrote:
> > > * Brian Swetland <[email protected]> wrote:
[...]
> > > In any case, this is not to suggest that the suspend-blocker bits are
> > > 'impossible' to merge. I just say that if you start with your most
> > > difficult feature you should not be surprised to be on the receiving end
> > > of a 1000+ mails flamewar on lkml ;-)
> >
> > Yeah, I do understand that we're not making it easy for ourselves here. I
> > think we hit the point where Rafael and Matthew signed off on things and
> > thought "aha, linux-pm maintainers are happy, now we're getting somewhere"
> > only to realize the light at the end of the tunnel was a bit further out
> > than we anticipated ^^
>
> That's a well-known problem on lkml: the light at the end of the tunnel was
> the other train ;-)
>
> Anyway, i'm not pessimistic at all: _some_ sort of scheme appears to be
> crystalising out today. Everyone seems to agree now that the main usecases are
> indeed useful and need handling one way or another - the rest is really just
> technological discussions how to achieve the mostly-agreed-upon end goal.

It's still not clear to me whether everyone's revolving around to using
the current suspend block API because it's orthogonal to all other
mechanisms and is therefore separate from the kernel (and can be
compiled out if you don't want it). Or whether re-expressing what the
android drivers want (minimum idle states and suspend block) in pm_qos
terms which others can use is the way to go. I think the latter, but
I'd like to know what other people think (because I'm not wedded to this
preference).

> The worst situation are features where one side says 'we dont need this kind
> of functionality at all' - IMO auto/opportunistic-suspend isnt in that
> situation, fortunately.

Great ... because deprecating the problem has been one of the persistent
memes by some people on this huge thread.

James

2010-06-04 14:50:28

by Alan Stern

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 4 Jun 2010, Ingo Molnar wrote:

> Note that this does not necessarily have to be implemented as 'execute suspend
> from the idle task' code: scheduling from the idle task, while can certainly
> be made to work, is a somewhat recursive concept that we might want to avoid
> for robustness reasons.
>
> Instead, the 'deepest idle' (suspend) method could consist of a wakeup of a
> kernel thread (or of any of the existing kernel threads such as the migration
> thread) - which kernel thread then does a race-free suspend: it offlines all
> but one CPU [on platforms that need that] and then initiates the suspend - but
> aborts the attempt if there's any sign of wakeup activity.

Out of morbid curiosity... A typical sign of wakeup activity is a
thread becoming runnable because of expiration of a kernel timer or an
I/O completion interrupt. How would the "race-free suspend" thread
detect this sort of thing? Indeed, isn't the inability to detect these
part of what makes the existing suspend implementation (the freezer in
particular) not race-free?

Alan Stern

2010-06-04 15:07:56

by Florian Mickler

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 04 Jun 2010 09:24:06 -0500
James Bottomley <[email protected]> wrote:

> On Fri, 2010-06-04 at 11:59 +0200, Ingo Molnar wrote:
> > Anyway, i'm not pessimistic at all: _some_ sort of scheme appears to be
> > crystalising out today. Everyone seems to agree now that the main usecases are
> > indeed useful and need handling one way or another - the rest is really just
> > technological discussions how to achieve the mostly-agreed-upon end goal.
>
> It's still not clear to me whether everyone's revolving around to using
> the current suspend block API because it's orthogonal to all other
> mechanisms and is therefore separate from the kernel (and can be
> compiled out if you don't want it). Or whether re-expressing what the
> android drivers want (minimum idle states and suspend block) in pm_qos
> terms which others can use is the way to go. I think the latter, but
> I'd like to know what other people think (because I'm not wedded to this
> preference).

I'd like to know that also.
I have a patch to add?pm_qos_add_request_nonblock function, so it is
possible to register an pm_qos constraint by passing preallocated
memory to it.

Notifying should be possible to do from atomic contexts via
async_schedule()?

The scalability issues of pm_qos can be adressed by using plists for
all pm_qos_class'es. Or by having the different pm_qos_class'es provide
their own implementations for the update and get operations.

Cheers,
Flo

>
> James
>

2010-06-04 23:37:29

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Friday 04 June 2010, Peter Zijlstra wrote:
> On Fri, 2010-06-04 at 01:23 +0200, Ingo Molnar wrote:
> > Btw., i'd like to summarize the scheduler based suspend scheme proposed by
> > Thomas Gleixner, Peter Zijlstra and myself. I found no good summary of it in
> > the big thread, and there are also new elements of the proposal:
>
> Just to clarify, my proposition doesn't go much further than treating
> 'suspend' as a genuine idle state (on suitable hardware, which x86 isn't).
>
> > - Create a 'deep idle' mode that suspends. This, if all constraints
> > are met, is triggered by the scheduler automatically: just like the other
> > idle modes are triggered currently. This approach fixes the wakeup
> > races because an incoming wakeup event will set need_resched() and
> > abort the suspend.
> >
>
> Right, so 'suspend' as idle seems (at least on UP/arm) a very sensible
> idea. On SMP current suspend hot-unplugs all but the boot cpu, I'm not
> sure we need to do that, since if the system is genuinely idle, what races
> are there?
>
> And if its not idle...
>
> > ( This mode can even use the existing suspend code to bring stuff down,
> > therefore it also solves the pending timer problem and works even on
> > PC style x86. )
>
> You cannot solve the pending timer issue from idle, unless you allow idle
> to stop clock_monotonic, which would change idle semantics, and that is not
> something I can say is a good idea.
>
> You want all idle states to have the same semantics, otherwise things just
> get way too confusing.
>
> > - Solve crappy app confinement via the scheduler:
> >
> > A first proposal was to use the existing cgroup mechanism,
>
> I still believe containment is a cgroup problem.

I kind of agree here, so I'd like to focus a bit on that.

Here's my idea in the very general terms:

(1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
that don't use suspend blockers aka wakelocks in the Android world) at the
point Android would normally start opportunistic suspend.

(2) Allow the cpuidle framework to put CPUs into low-power states after the
"trusted" apps (ie. the ones that use suspend blockers in the Android
world) have gone idle.

(3) Teach the cpuidle framework to schedule runtime suspend of I/O devices
before idling the last CPU (*).

(4) Design a mechanism to resume the I/O devices suspended in (3) so that
they are not powered up unnecessarily (that's going to be difficult as far
as I can see).

This way, in principle, we should be able to save (at least almost) as much
energy as the opportunistic suspend currently used by Android, provided that
things will be capable of staying idle for extended periods of time.

(*) That may require per-device PM QoS requirements to be used, in which case
devices may even be suspended earlier if the PM QoS requirements of all
of their users are met.

I wonder what people think. Is this realistic and if so, would it be difficult
to implement?

Rafael

2010-06-05 00:07:27

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> I kind of agree here, so I'd like to focus a bit on that.
>
> Here's my idea in the very general terms:
>
> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
> that don't use suspend blockers aka wakelocks in the Android world) at the
> point Android would normally start opportunistic suspend.

There is an additional benefit to this approach:

In the current android world a background task (e.g. download
initiated before the screensaver kicked in) prevents the suspend,
but that also means that the crapplications can still suck power
completely unconfined.

With the cgroup freezer you can "suspend" them right away and
just keep the trusted background task(s) alive which allows us to
go into deeper idle states instead of letting the crapplications
run unconfined until the download finished and the suspend
blocker goes away.

> (2) Allow the cpuidle framework to put CPUs into low-power states after the
> "trusted" apps (ie. the ones that use suspend blockers in the Android
> world) have gone idle.
>
> (3) Teach the cpuidle framework to schedule runtime suspend of I/O devices
> before idling the last CPU (*).
>
> (4) Design a mechanism to resume the I/O devices suspended in (3) so that
> they are not powered up unnecessarily (that's going to be difficult as far
> as I can see).
>
> This way, in principle, we should be able to save (at least almost) as much
> energy as the opportunistic suspend currently used by Android, provided that
> things will be capable of staying idle for extended periods of time.
>
> (*) That may require per-device PM QoS requirements to be used, in which case
> devices may even be suspended earlier if the PM QoS requirements of all
> of their users are met.
>
> I wonder what people think. Is this realistic and if so, would it be difficult
> to implement?

I think it's realistic and not overly complicated to implement.

Thanks,

tglx

2010-06-05 00:10:50

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/4 Peter Zijlstra <[email protected]>:
> On Fri, 2010-06-04 at 01:56 -0700, Arve Hj?nnev?g wrote:
>> On Fri, Jun 4, 2010 at 1:34 AM, Ingo Molnar <[email protected]> wrote:
>> >
>> > * Arve Hj?nnev?g <[email protected]> wrote:
>> >
>> >> > [...]
>> >> >
>> >> > Why do you need to track input wakeups? It's rather fragile and rather
>> >> > unnecessary [...]
>> >>
>> >> Because we have keys that should always turn the screen on, but the problem
>> >> is not specific to input events. If we enabled a wakeup event it usually
>> >> means we need this event to always work, not just when the system is fully
>> >> awake or fully suspended.
>> >
>> > Hm, i cannot follow that generic claim. Could you please point out the problem
>> > to me via a specific example? Which task does what, what undesirable thing
>> > happens where, etc.
>> >
>>
>> We have many wakeup events, and some of them are invisible to the
>> user. For instance on the Nexus One wake up every 10 minutes monitor
>> the battery health.
>
>> If the user presses a key right after this work
>> has finished and we did not block suspend until userspace could
>> process this key event, we risk suspending before we could turn the
>> screen on, which to the user looks like the key did not work.
>
>> Another
>> example, the user pressed the power key which turns the screen off and
>> allows suspend. We initiate suspend and a phone call comes in. If we
>> don't block suspend until we processed the incoming phone call
>> notification, the phone may never ring (some devices will send a new
>> message every few seconds for this, so on those devices it would just
>> delay the ringing).
>
> Right, so in the proposed scheme all these tasks would be executed by
> trusted processes, and trusted processes will never get frozen and so
> will never be delayed in processing these events.
>

There are many proposes schemes. I assume you mean freezing only
untrusted processes and nothing else.

> Only untrusted code will be frozen. And trusted processes are reliable
> for thawing the untrusted processes and delivering events to it.
>

I have two problems with this. I don't want to funnel all events
trough trusted processes, and I also want to freeze trusted processes.

> Trusted processes are assumed to be sane and idle when there is nothing
> for them to do, allowing the machine to go into deep idle states.
>

Neither the kernel nor our trusted user-space code currently meets
this criteria.

--
Arve Hj?nnev?g

2010-06-05 00:39:19

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
> On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
>> I kind of agree here, so I'd like to focus a bit on that.
>>
>> Here's my idea in the very general terms:
>>
>> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
>> ? ? that don't use suspend blockers aka wakelocks in the Android world) at the
>> ? ? point Android would normally start opportunistic suspend.
>
> There is an additional benefit to this approach:
>
> ? ? In the current android world a background task (e.g. download
> ? ? initiated before the screensaver kicked in) prevents the suspend,
> ? ? but that also means that the crapplications can still suck power
> ? ? completely unconfined.
>

Yes this can happen. It is usually only a big problem when you combine
an (trusted) application that has a bug that blocks suspend forever
with an application that wakes up too often for us to enter low power
idle modes.

> ? ? With the cgroup freezer you can "suspend" them right away and
> ? ? just keep the trusted background task(s) alive which allows us to
> ? ? go into deeper idle states instead of letting the crapplications
> ? ? run unconfined until the download finished and the suspend
> ? ? blocker goes away.
>

Yes this would be better, but I want it in addition to suspend, not
instead of it. It is also unclear if our user-space code could easily
make use of it since our trusted code calls into untrusted code.

>> (2) Allow the cpuidle framework to put CPUs into low-power states after the
>> ? ? "trusted" apps (ie. the ones that use suspend blockers in the Android
>> ? ? world) have gone idle.
>>

As far as I know this is what we already have on hardware that supports it.

>> (3) Teach the cpuidle framework to schedule runtime suspend of I/O devices
>> ? ? before idling the last CPU (*).
>>

I don't think we need this for android phones. We already put I/O
devices in low power modes when they are not in use.

>> (4) Design a mechanism to resume the I/O devices suspended in (3) so that
>> ? ? they are not powered up unnecessarily (that's going to be difficult as far
>> ? ? as I can see).
>>
>> This way, in principle, we should be able to save (at least almost) as much
>> energy as the opportunistic suspend currently used by Android, provided that
>> things will be capable of staying idle for extended periods of time.

The main reason we use suspend is that the system does not stay idle
for extened periods of time. If this gets fixed, and our if user-space
framework can deal with a subset of processes being frozen (this is a
big if) this solution may work, but it does not help us today.

>>
>> (*) That may require per-device PM QoS requirements to be used, in which case
>> ? ? devices may even be suspended earlier if the PM QoS requirements of all
>> ? ? of their users are met.
>>
>> I wonder what people think. ?Is this realistic and if so, would it be difficult
>> to implement?
>
> I think it's realistic and not overly complicated to implement.
>

The kernel support can be easily implemented on most arm hardware, I
don't know if it can work on most existing x86 hardware. It does not
give us the same power savings as suspend with existing software, but
it can handle bad apps better (assuming you don't combine
opportunistic suspend and cgroup freezing). The biggest hurdle is how
to handle dependencies between processes that gets frozen and
processes that don't get frozen.

--
Arve Hj?nnev?g

2010-06-05 01:18:33

by Matt Helsley

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, Jun 04, 2010 at 05:39:17PM -0700, Arve Hj?nnev?g wrote:
> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:

<snip>

>
> > ? ? With the cgroup freezer you can "suspend" them right away and
> > ? ? just keep the trusted background task(s) alive which allows us to
> > ? ? go into deeper idle states instead of letting the crapplications
> > ? ? run unconfined until the download finished and the suspend
> > ? ? blocker goes away.
> >
>
> Yes this would be better, but I want it in addition to suspend, not
> instead of it. It is also unclear if our user-space code could easily
> make use of it since our trusted code calls into untrusted code.
>

Perhaps I'm misunderstanding, but suspend and the cgroup freezer
interoperate well today -- you don't have to choose one or the other.
If you've discovered otherwise I'd consider it a bug and would like to
hear more about it.

<snip>

> it can handle bad apps better (assuming you don't combine
> opportunistic suspend and cgroup freezing).

I don't see why that would be a problem. The cgroup freezer works
independently of the suspend freezer -- even with suspend blockers.
So my hunch is this is really the same as the next problem you refer to:

> The biggest hurdle is how
> to handle dependencies between processes that gets frozen and
> processes that don't get frozen.

I'm not sure it covers everything you want, but it should be possible to
identify some of those so long as you know which process you're
communicating with.

A trusted app can look up the freezer cgroup of a target app in /proc, then
look at the cgroup's freezer.state file. If it's FREEZING or FROZEN then
you've very likely got a "bad" dependency.

For example, say a trusted app plans on doing a blocking read() to fetch
the output of an untrusted app via a pipe. Assuming we know the untrusted
app's pid we could then check the dependency and determine that we're likely
to block because the untrusted app's freezer cgroup is FREEZING or FROZEN.
(certain to block if we see FROZEN)

That said, it involves quite a few system calls compared to a simple read()
from the pipe. So my guess is it would be a debugging tool at best -- not
something you always have enabled.

It may even be possible to make an lsof-like debugging tool to do that from
outside both apps.

Cheers,
-Matt Helsley

2010-06-05 01:34:08

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

Arve,

On Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:

> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> >> I kind of agree here, so I'd like to focus a bit on that.
> >>
> >> Here's my idea in the very general terms:
> >>
> >> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
> >> ? ? that don't use suspend blockers aka wakelocks in the Android world) at the
> >> ? ? point Android would normally start opportunistic suspend.
> >
> > There is an additional benefit to this approach:
> >
> > ? ? In the current android world a background task (e.g. download
> > ? ? initiated before the screensaver kicked in) prevents the suspend,
> > ? ? but that also means that the crapplications can still suck power
> > ? ? completely unconfined.
> >
>
> Yes this can happen. It is usually only a big problem when you combine
> an (trusted) application that has a bug that blocks suspend forever
> with an application that wakes up too often for us to enter low power
> idle modes.

Why is it a BUG in the trusted app, when I initiate a download and put
the phone down ?

That download might take a minute or two, but that's not an
justification for the crapplication to run unconfined and prevent
lower power states.

> > ? ? With the cgroup freezer you can "suspend" them right away and
> > ? ? just keep the trusted background task(s) alive which allows us to
> > ? ? go into deeper idle states instead of letting the crapplications
> > ? ? run unconfined until the download finished and the suspend
> > ? ? blocker goes away.
> >
>
> Yes this would be better, but I want it in addition to suspend, not
> instead of it. It is also unclear if our user-space code could easily
> make use of it since our trusted code calls into untrusted code.

Sorry, that's really the worst argument I saw in this whole
discussion.

You're basically saying, that you have no idea what your user space
stack is doing and you do not care at all as long as your suspend
blocker scheme makes things work somehow.

Up to that point, I really tried hard to step back from my initial
"OMG, promoting crap is a nono" reaction and work with you on a
sensible technical solution to confine crap and make it aligned with
other efforts in this area.

So now, after I spent a reasonable amount of time (as you did) to
understand what your requirements are, you come up with another
restriction which is so outside of any level of sanity, that I'm at
the point of giving up and just going into NAK mode.

Can you please answer the following question:

What is the point of having the distinction of "trusted" and
"untrusted" when you have no way to prevent "trusted" code calling
"into "untrusted" code ?

That's violating any sense of abstraction and layering and makes it
entirely clear that the only way you can deal with your own design
failure is a big hammer which you need to force into the kernel.

Sorry, no. I'm perfectly willing to make progress on that, as long as
we walk on a sane ground. But abusing the kernel for fixing basic
engineering problems in the user space side of affairs is completely
out of discussion.

> >> I wonder what people think. ?Is this realistic and if so, would it be difficult
> >> to implement?
> >
> > I think it's realistic and not overly complicated to implement.
> >
>
> The kernel support can be easily implemented on most arm hardware, I
> don't know if it can work on most existing x86 hardware. It does not

It does not matter. Even Intel folks told you more than once, that x86
hardware is going to be fixed pretty soon. Hint: that's crucial to
their business ....

> give us the same power savings as suspend with existing software, but
> it can handle bad apps better (assuming you don't combine
> opportunistic suspend and cgroup freezing). The biggest hurdle is how
> to handle dependencies between processes that gets frozen and
> processes that don't get frozen.

See above.

Thanks,

tglx

2010-06-05 05:23:32

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/4 Thomas Gleixner <[email protected]>:
> Arve,
>
> On Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
>
>> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
>> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
>> >> I kind of agree here, so I'd like to focus a bit on that.
>> >>
>> >> Here's my idea in the very general terms:
>> >>
>> >> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
>> >> ? ? that don't use suspend blockers aka wakelocks in the Android world) at the
>> >> ? ? point Android would normally start opportunistic suspend.
>> >
>> > There is an additional benefit to this approach:
>> >
>> > ? ? In the current android world a background task (e.g. download
>> > ? ? initiated before the screensaver kicked in) prevents the suspend,
>> > ? ? but that also means that the crapplications can still suck power
>> > ? ? completely unconfined.
>> >
>>
>> Yes this can happen. It is usually only a big problem when you combine
>> an (trusted) application that has a bug that blocks suspend forever
>> with an application that wakes up too often for us to enter low power
>> idle modes.
>
> Why is it a BUG in the trusted app, when I initiate a download and put
> the phone down ?
>

It is not, but we have had bugs where a trusted app does not unblock
suspend after some failure case where it is no longer making any
progress.

> That download might take a minute or two, but that's not an
> justification for the crapplication to run unconfined and prevent
> lower power states.
>

I agree, but this is not a simple problem to solve.

>> > ? ? With the cgroup freezer you can "suspend" them right away and
>> > ? ? just keep the trusted background task(s) alive which allows us to
>> > ? ? go into deeper idle states instead of letting the crapplications
>> > ? ? run unconfined until the download finished and the suspend
>> > ? ? blocker goes away.
>> >
>>
>> Yes this would be better, but I want it in addition to suspend, not
>> instead of it. It is also unclear if our user-space code could easily
>> make use of it since our trusted code calls into untrusted code.
>
> Sorry, that's really the worst argument I saw in this whole
> discussion.
>
> You're basically saying, that you have no idea what your user space
> stack is doing and you do not care at all as long as your suspend
> blocker scheme makes things work somehow.
>

Yes I don't know everything our user-space stack is doing, but I do
know that it makes many calls between processes (and in both
directions). As far as I know it uses timeouts when calling into
untrusted code, so a misbehaving application will cause an error
dialog to pop up asking if the user if it should wait longer or
terminate the application.

> Up to that point, I really tried hard to step back from my initial
> "OMG, promoting crap is a nono" reaction and work with you on a
> sensible technical solution to confine crap and make it aligned with
> other efforts in this area.
>
> So now, after I spent a reasonable amount of time (as you did) to
> understand what your requirements are, you come up with another
> restriction which is so outside of any level of sanity, that I'm at
> the point of giving up and just going into NAK mode.
>

I don't think this is a new restriction. Both Brian and I have
mentioned that we have a lot of dependencies between processes.

> Can you please answer the following question:
>
> ? ?What is the point of having the distinction of "trusted" and
> ? ?"untrusted" when you have no way to prevent "trusted" code calling
> ? ?"into "untrusted" code ?
>

Trusted code that calls into untrusted code has to deal with the
untrusted code not responding, but we only want to pop up a message
that the application is not responding if it is misbehaving, not just
because it was frozen though no fault of its own.

> That's violating any sense of abstraction and layering and makes it
> entirely clear that the only way you can deal with your own design
> failure is a big hammer which you need to force into the kernel.
>

How can it be fixed? The user presses the back button, the framework
determines that app A is in the foreground and send the key to app A,
app A decides that it it does not have anything internal to go back to
and tells the framework to switch back to the previous app. If the
user presses the back key again, the framework does not know which app
this key should go to until app A has finished processing the first
key press.

> Sorry, no. I'm perfectly willing to make progress on that, as long as
> we walk on a sane ground. But abusing the kernel for fixing basic
> engineering problems in the user space side of affairs is completely
> out of discussion.
>
>> >> I wonder what people think. ?Is this realistic and if so, would it be difficult
>> >> to implement?
>> >
>> > I think it's realistic and not overly complicated to implement.
>> >
>>
>> The kernel support can be easily implemented on most arm hardware, I
>> don't know if it can work on most existing x86 hardware. It does not
>
> It does not matter. Even Intel folks told you more than once, that x86

How does it not matter. Are dropping support for existing x86 hardware
once the new hardware comes out?

> hardware is going to be fixed pretty soon. Hint: that's crucial to
> their business ....
>
>> give us the same power savings as suspend with existing software, but
>> it can handle bad apps better (assuming you don't combine
>> opportunistic suspend and cgroup freezing). The biggest hurdle is how
>> to handle dependencies between processes that gets frozen and
>> processes that don't get frozen.
>
> See above.
>
> Thanks,
>
> ? ? ? ?tglx



--
Arve Hj?nnev?g

2010-06-05 05:35:11

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/4 Matt Helsley <[email protected]>:
> On Fri, Jun 04, 2010 at 05:39:17PM -0700, Arve Hj?nnev?g wrote:
>> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
>> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
>
> <snip>
>
>>
>> > ? ? With the cgroup freezer you can "suspend" them right away and
>> > ? ? just keep the trusted background task(s) alive which allows us to
>> > ? ? go into deeper idle states instead of letting the crapplications
>> > ? ? run unconfined until the download finished and the suspend
>> > ? ? blocker goes away.
>> >
>>
>> Yes this would be better, but I want it in addition to suspend, not
>> instead of it. It is also unclear if our user-space code could easily
>> make use of it since our trusted code calls into untrusted code.
>>
>
> Perhaps I'm misunderstanding, but suspend and the cgroup freezer
> interoperate well today -- you don't have to choose one or the other.
> If you've discovered otherwise I'd consider it a bug and would like to
> hear more about it.
>

I'm not aware of any bug with combining both, but we cannot use
suspend at all without suspend blockers in the kernel (since wakeup
events may be ignored) and I don't know how we can safely freeze
cgroups without funneling all potential wakeup events through a
process that never gets frozen.

> <snip>
>
>> it can handle bad apps better (assuming you don't combine
>> opportunistic suspend and cgroup freezing).
>
> I don't see why that would be a problem. The cgroup freezer works
> independently of the suspend freezer -- even with suspend blockers.
> So my hunch is this is really the same as the next problem you refer to:
>
>> The biggest hurdle is how
>> to handle dependencies between processes that gets frozen and
>> processes that don't get frozen.
>
> I'm not sure it covers everything you want, but it should be possible to
> identify some of those so long as you know which process you're
> communicating with.
>
> A trusted app can look up the freezer cgroup of a target app in /proc, then
> look at the cgroup's freezer.state file. If it's FREEZING or FROZEN then
> you've very likely got a "bad" dependency.
>

I don't think they are "bad" dependencies. Our framework has to
communicate with apps.

> For example, say a trusted app plans on doing a blocking read() to fetch
> the output of an untrusted app via a pipe. Assuming we know the untrusted
> app's pid we could then check the dependency and determine that we're likely
> to block because the untrusted app's freezer cgroup is FREEZING or FROZEN.
> (certain to block if we see FROZEN)
>
> That said, it involves quite a few system calls compared to a simple read()
> from the pipe. So my guess is it would be a debugging tool at best -- not
> something you always have enabled.
>
> It may even be possible to make an lsof-like debugging tool to do that from
> outside both apps.
>
> Cheers,
> ? ? ? ?-Matt Helsley
>

--
Arve Hj?nnev?g

2010-06-05 09:54:42

by Peter Zijlstra

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 2010-06-04 at 17:10 -0700, Arve Hjønnevåg wrote:
> > Trusted processes are assumed to be sane and idle when there is nothing
> > for them to do, allowing the machine to go into deep idle states.
> >
>
> Neither the kernel nor our trusted user-space code currently meets
> this criteria.

Then both need fixing. Really, that's the only sane approach.

2010-06-05 16:26:26

by Arjan van de Ven

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 05 Jun 2010 11:54:13 +0200
Peter Zijlstra <[email protected]> wrote:

> On Fri, 2010-06-04 at 17:10 -0700, Arve Hjønnevåg wrote:
> > > Trusted processes are assumed to be sane and idle when there is
> > > nothing for them to do, allowing the machine to go into deep idle
> > > states.
> > >
> >
> > Neither the kernel nor our trusted user-space code currently meets
> > this criteria.
>
> Then both need fixing. Really, that's the only sane approach.

fwiw... in MeeGo we're seeing quite good idle times (> 1 seconds)
without really bad hacks.

the kernel has a set of infrastructure already to help here (range
timers, with which you can wakeup-limit untrusted userspace crap),
timer slack for legacy background timers, etc etc.

getting to 10 seconds is not in the range of impossibilities to be
honest... and that's even without doing things like putting untrusted
junk (read: Appstore apps) into a cgroup and do wakeup limiting and cpu
time limiting on a cgroup level....


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2010-06-05 16:50:35

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:

> 2010/6/4 Thomas Gleixner <[email protected]>:
> > Arve,
> >
> > On Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
> >
> >> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
> >> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> >> >> I kind of agree here, so I'd like to focus a bit on that.
> >> >>
> >> >> Here's my idea in the very general terms:
> >> >>
> >> >> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
> >> >> ? ? that don't use suspend blockers aka wakelocks in the Android world) at the
> >> >> ? ? point Android would normally start opportunistic suspend.
> >> >
> >> > There is an additional benefit to this approach:
> >> >
> >> > ? ? In the current android world a background task (e.g. download
> >> > ? ? initiated before the screensaver kicked in) prevents the suspend,
> >> > ? ? but that also means that the crapplications can still suck power
> >> > ? ? completely unconfined.
> >> >
> >>
> >> Yes this can happen. It is usually only a big problem when you combine
> >> an (trusted) application that has a bug that blocks suspend forever
> >> with an application that wakes up too often for us to enter low power
> >> idle modes.
> >
> > Why is it a BUG in the trusted app, when I initiate a download and put
> > the phone down ?
> >
>
> It is not, but we have had bugs where a trusted app does not unblock
> suspend after some failure case where it is no longer making any
> progress.

Well, that's simply an application bug which sucks battery with or
without suspend blockers. So it's unrelated to the freezing of
untrusted apps while a trusted app still works in the background
before allowing the machine to suspend.

> > That download might take a minute or two, but that's not an
> > justification for the crapplication to run unconfined and prevent
> > lower power states.
> >
>
> I agree, but this is not a simple problem to solve.

Not with suspend blockers, but with cgroup confinement of crap, it's
straight forward.

> >> > ? ? With the cgroup freezer you can "suspend" them right away and
> >> > ? ? just keep the trusted background task(s) alive which allows us to
> >> > ? ? go into deeper idle states instead of letting the crapplications
> >> > ? ? run unconfined until the download finished and the suspend
> >> > ? ? blocker goes away.
> >> >
> >>
> >> Yes this would be better, but I want it in addition to suspend, not
> >> instead of it. It is also unclear if our user-space code could easily
> >> make use of it since our trusted code calls into untrusted code.
> >
> > Sorry, that's really the worst argument I saw in this whole
> > discussion.
> >
> > You're basically saying, that you have no idea what your user space
> > stack is doing and you do not care at all as long as your suspend
> > blocker scheme makes things work somehow.
> >
>
> Yes I don't know everything our user-space stack is doing, but I do
> know that it makes many calls between processes (and in both
> directions). As far as I know it uses timeouts when calling into
> untrusted code, so a misbehaving application will cause an error
> dialog to pop up asking if the user if it should wait longer or
> terminate the application.

Sigh, the more I learn about the details of android and it's violation
of all sane engineering principles the more I understand why you
invented a huge nail to push through all layers in order to bring the
system into idle at all. And yes, you need a sledge hammer to drive
that big nail through everything, so you are using the right tool.

Seriously, the cross app call goes through your framework, which
already knows, that the untrusted part is frozen. So it can deal
nicely with it in any way you want including unfreezing.

> > Up to that point, I really tried hard to step back from my initial
> > "OMG, promoting crap is a nono" reaction and work with you on a
> > sensible technical solution to confine crap and make it aligned with
> > other efforts in this area.
> >
> > So now, after I spent a reasonable amount of time (as you did) to
> > understand what your requirements are, you come up with another
> > restriction which is so outside of any level of sanity, that I'm at
> > the point of giving up and just going into NAK mode.
> >
>
> I don't think this is a new restriction. Both Brian and I have
> mentioned that we have a lot of dependencies between processes.
>
> > Can you please answer the following question:
> >
> > ? ?What is the point of having the distinction of "trusted" and
> > ? ?"untrusted" when you have no way to prevent "trusted" code calling
> > ? ?"into "untrusted" code ?
> >
>
> Trusted code that calls into untrusted code has to deal with the
> untrusted code not responding, but we only want to pop up a message
> that the application is not responding if it is misbehaving, not just
> because it was frozen though no fault of its own.

See above.

> > That's violating any sense of abstraction and layering and makes it
> > entirely clear that the only way you can deal with your own design
> > failure is a big hammer which you need to force into the kernel.
> >
>
> How can it be fixed? The user presses the back button, the framework
> determines that app A is in the foreground and send the key to app A,
> app A decides that it it does not have anything internal to go back to
> and tells the framework to switch back to the previous app. If the
> user presses the back key again, the framework does not know which app
> this key should go to until app A has finished processing the first
> key press.

Errm, what has this to do with frozen apps? If your system is
handling input events then there are no frozen apps and even if they
are frozen your framework can unfreeze them _before_ talking to them.

So which unfixable problem are you describing with the above example ?

Thanks,

tglx

2010-06-05 18:24:10

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Saturday 05 June 2010, Arve Hj?nnev?g wrote:
> 2010/6/4 Matt Helsley <[email protected]>:
> > On Fri, Jun 04, 2010 at 05:39:17PM -0700, Arve Hj?nnev?g wrote:
> >> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
> >> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> >
> > <snip>
> >
> >>
> >> > With the cgroup freezer you can "suspend" them right away and
> >> > just keep the trusted background task(s) alive which allows us to
> >> > go into deeper idle states instead of letting the crapplications
> >> > run unconfined until the download finished and the suspend
> >> > blocker goes away.
> >> >
> >>
> >> Yes this would be better, but I want it in addition to suspend, not
> >> instead of it. It is also unclear if our user-space code could easily
> >> make use of it since our trusted code calls into untrusted code.
> >>
> >
> > Perhaps I'm misunderstanding, but suspend and the cgroup freezer
> > interoperate well today -- you don't have to choose one or the other.
> > If you've discovered otherwise I'd consider it a bug and would like to
> > hear more about it.
> >
>
> I'm not aware of any bug with combining both, but we cannot use
> suspend at all without suspend blockers in the kernel (since wakeup
> events may be ignored)

The more I think of it, the more it appears to me that the problem of
lost wakeup events can actually be solved without suspend blockers.
I'll send a bunch of patches to address this issue, probably tomorrow.

> and I don't know how we can safely freeze
> cgroups without funneling all potential wakeup events through a
> process that never gets frozen.

If your untrusted apps get called by the trusted ones, they aren't really
untrusted in the first place.

>From what you're saying it follows that you're not really willing to accept
any solution different to your suspend blockers. Is that really the case?

Rafael

2010-06-05 20:37:28

by Florian Mickler

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Thu, 3 Jun 2010 19:16:55 -0700 (PDT)
Linus Torvalds <[email protected]> wrote:

> The thing is, unless there is some _really_ deep other reason to do
> something like this, I still think it's total overdesign to push any
> knowledge/choices like this into the scheduler. I'd rather keep things way
> more independent, less tied to each other and to deep kernel subsystems.
>
> IOW, my personal opinion is that somethng like a suspend (blocker or not)
> decision simply shouldn't be important enough to be tied into the
> scheduler. Especially not if it could just be its own layer.
>
> That said, as far as I know, the Android people have mostly been looking
> at the suspend angle from a single-core standpoint. And I'm not at all
> convinced that they should hijack the existing "/sys/power/state" thing
> which is what I think they do now.
>
> And those two things go together. The /sys/power/state thing is a global
> suspend - which I don't think is appropriate for a opportunistic thing in
> the first place, especially for multi-core.
>

This sounds right.

If there is soo much need for a better solution, it will emerge. With
merged suspend blockers or not.

Just my 2 cents.

> Linus

Cheers,
Flo

2010-06-05 21:26:17

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, Jun 5, 2010 at 9:28 AM, Arjan van de Ven <[email protected]> wrote:
> On Sat, 05 Jun 2010 11:54:13 +0200
> Peter Zijlstra <[email protected]> wrote:
>
>> On Fri, 2010-06-04 at 17:10 -0700, Arve Hj?nnev?g wrote:
>> > > Trusted processes are assumed to be sane and idle when there is
>> > > nothing for them to do, allowing the machine to go into deep idle
>> > > states.
>> > >
>> >
>> > Neither the kernel nor our trusted user-space code currently meets
>> > this criteria.
>>
>> Then both need fixing. Really, that's the only sane approach.
>
> fwiw... in MeeGo we're seeing quite good idle times (> 1 seconds)
> without really bad hacks.
>

We clearly have different standards for what we consider good. We
measure time suspended in minutes or hours, not seconds, and waking up
every second or two causes a noticeable decrease in battery life on
the hardware we have today.

> the kernel has a set of infrastructure already to help here (range
> timers, with which you can wakeup-limit untrusted userspace crap),
> timer slack for legacy background timers, etc etc.

Range timers allows the kernel to align different timers so they don't
each bring the cpu out of idle individually. They do not eliminate
timers or make individual timers fire less often. For example if you
have 10 timers that fire every second on an idle system, without range
timers you will most likely have to bring the cpu out of idle 10 times
a second, but with range timers you have a chance of waking up only
once a second (I say a chance here, since if they are identical they
will just chase each other and never catch up).

>
> getting to 10 seconds is not in the range of impossibilities to be
> honest... and that's even without doing things like putting untrusted

That is still far short of what we get with suspend (in terms of time).

> junk (read: Appstore apps) into a cgroup and do wakeup limiting and cpu
> time limiting on a cgroup level....
>
>
> --
> Arjan van de Ven ? ? ? ?Intel Open Source Technology Centre
> For development, discussion and tips for power savings,
> visit http://www.lesswatts.org
>



--
Arve Hj?nnev?g

2010-06-05 21:47:28

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Thomas Gleixner <[email protected]>:
> B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
>
>> 2010/6/4 Thomas Gleixner <[email protected]>:
>> > Arve,
>> >
>> > On Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
>> >
>> >> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
>> >> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
>> >> >> I kind of agree here, so I'd like to focus a bit on that.
>> >> >>
>> >> >> Here's my idea in the very general terms:
>> >> >>
>> >> >> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
>> >> >> ? ? that don't use suspend blockers aka wakelocks in the Android world) at the
>> >> >> ? ? point Android would normally start opportunistic suspend.
>> >> >
>> >> > There is an additional benefit to this approach:
>> >> >
>> >> > ? ? In the current android world a background task (e.g. download
>> >> > ? ? initiated before the screensaver kicked in) prevents the suspend,
>> >> > ? ? but that also means that the crapplications can still suck power
>> >> > ? ? completely unconfined.
>> >> >
>> >>
>> >> Yes this can happen. It is usually only a big problem when you combine
>> >> an (trusted) application that has a bug that blocks suspend forever
>> >> with an application that wakes up too often for us to enter low power
>> >> idle modes.
>> >
>> > Why is it a BUG in the trusted app, when I initiate a download and put
>> > the phone down ?
>> >
>>
>> It is not, but we have had bugs where a trusted app does not unblock
>> suspend after some failure case where it is no longer making any
>> progress.
>
> Well, that's simply an application bug which sucks battery with or
> without suspend blockers. So it's unrelated to the freezing of
> untrusted apps while a trusted app still works in the background
> before allowing the machine to suspend.
>

It is not unrelated if the trusted app has stopped working but still
blocks suspend. The battery drains when you combine them.

>> > That download might take a minute or two, but that's not an
>> > justification for the crapplication to run unconfined and prevent
>> > lower power states.
>> >
>>
>> I agree, but this is not a simple problem to solve.
>
> Not with suspend blockers, but with cgroup confinement of crap, it's
> straight forward.
>

I don't think is is straight forward. If the a process in the frozen
group holds a resource that a process in the unfrozen group needs, how
do deal with that?

>> >> > ? ? With the cgroup freezer you can "suspend" them right away and
>> >> > ? ? just keep the trusted background task(s) alive which allows us to
>> >> > ? ? go into deeper idle states instead of letting the crapplications
>> >> > ? ? run unconfined until the download finished and the suspend
>> >> > ? ? blocker goes away.
>> >> >
>> >>
>> >> Yes this would be better, but I want it in addition to suspend, not
>> >> instead of it. It is also unclear if our user-space code could easily
>> >> make use of it since our trusted code calls into untrusted code.
>> >
>> > Sorry, that's really the worst argument I saw in this whole
>> > discussion.
>> >
>> > You're basically saying, that you have no idea what your user space
>> > stack is doing and you do not care at all as long as your suspend
>> > blocker scheme makes things work somehow.
>> >
>>
>> Yes I don't know everything our user-space stack is doing, but I do
>> know that it makes many calls between processes (and in both
>> directions). As far as I know it uses timeouts when calling into
>> untrusted code, so a misbehaving application will cause an error
>> dialog to pop up asking if the user if it should wait longer or
>> terminate the application.
>
> Sigh, the more I learn about the details of android and it's violation
> of all sane engineering principles the more I understand why you
> invented a huge nail to push through all layers in order to bring the
> system into idle at all. And yes, you need a sledge hammer to drive
> that big nail through everything, so you are using the right tool.
>
> Seriously, the cross app call goes through your framework, which
> already knows, that the untrusted part is frozen. So it can deal
> nicely with it in any way you want including unfreezing.

Cross app calls do not go through a central process.

>
>> > Up to that point, I really tried hard to step back from my initial
>> > "OMG, promoting crap is a nono" reaction and work with you on a
>> > sensible technical solution to confine crap and make it aligned with
>> > other efforts in this area.
>> >
>> > So now, after I spent a reasonable amount of time (as you did) to
>> > understand what your requirements are, you come up with another
>> > restriction which is so outside of any level of sanity, that I'm at
>> > the point of giving up and just going into NAK mode.
>> >
>>
>> I don't think this is a new restriction. Both Brian and I have
>> mentioned that we have a lot of dependencies between processes.
>>
>> > Can you please answer the following question:
>> >
>> > ? ?What is the point of having the distinction of "trusted" and
>> > ? ?"untrusted" when you have no way to prevent "trusted" code calling
>> > ? ?"into "untrusted" code ?
>> >
>>
>> Trusted code that calls into untrusted code has to deal with the
>> untrusted code not responding, but we only want to pop up a message
>> that the application is not responding if it is misbehaving, not just
>> because it was frozen though no fault of its own.
>
> See above.
>
>> > That's violating any sense of abstraction and layering and makes it
>> > entirely clear that the only way you can deal with your own design
>> > failure is a big hammer which you need to force into the kernel.
>> >
>>
>> How can it be fixed? The user presses the back button, the framework
>> determines that app A is in the foreground and send the key to app A,
>> app A decides that it it does not have anything internal to go back to
>> and tells the framework to switch back to the previous app. If the
>> user presses the back key again, the framework does not know which app
>> this key should go to until app A has finished processing the first
>> key press.
>
> Errm, what has this to do with frozen apps? If your system is
> handling input events then there are no frozen apps and even if they
> are frozen your framework can unfreeze them _before_ talking to them.
>
> So which unfixable problem are you describing with the above example ?
>

You are claiming that trusted code should not have any dependencies on
untrusted code. I gave you a visible example of such a dependency and
want you to tell me how you can avoid this dependency. Since you are
claiming that our user-space framework is fundamentally broken if it
has to wait for untrusted code, I don't think it is unreasonable for
you to answer this. Or do you think it is valid to communicate with
untrusted code when the screen is on but not when it is off.

--
Arve Hj?nnev?g

2010-06-05 22:10:48

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Rafael J. Wysocki <[email protected]>:
> On Saturday 05 June 2010, Arve Hj?nnev?g wrote:
>> 2010/6/4 Matt Helsley <[email protected]>:
>> > On Fri, Jun 04, 2010 at 05:39:17PM -0700, Arve Hj?nnev?g wrote:
>> >> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
>> >> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
>> >
>> > <snip>
>> >
>> >>
>> >> > ? ? With the cgroup freezer you can "suspend" them right away and
>> >> > ? ? just keep the trusted background task(s) alive which allows us to
>> >> > ? ? go into deeper idle states instead of letting the crapplications
>> >> > ? ? run unconfined until the download finished and the suspend
>> >> > ? ? blocker goes away.
>> >> >
>> >>
>> >> Yes this would be better, but I want it in addition to suspend, not
>> >> instead of it. It is also unclear if our user-space code could easily
>> >> make use of it since our trusted code calls into untrusted code.
>> >>
>> >
>> > Perhaps I'm misunderstanding, but suspend and the cgroup freezer
>> > interoperate well today -- you don't have to choose one or the other.
>> > If you've discovered otherwise I'd consider it a bug and would like to
>> > hear more about it.
>> >
>>
>> I'm not aware of any bug with combining both, but we cannot use
>> suspend at all without suspend blockers in the kernel (since wakeup
>> events may be ignored)
>
> The more I think of it, the more it appears to me that the problem of
> lost wakeup events can actually be solved without suspend blockers.
> I'll send a bunch of patches to address this issue, probably tomorrow.
>

I know of two ways to prevent lost wakeup events. Reset a timeout
every time you receive a wakeup event or prevents suspend until you
know the event has been fully processed. Does your solution fall onto
one of these two categories, or do you have a third way?

>> and I don't know how we can safely freeze
>> cgroups without funneling all potential wakeup events through a
>> process that never gets frozen.
>
> If your untrusted apps get called by the trusted ones, they aren't really
> untrusted in the first place.
>
That is not a correct statement. A trusted apps can call into an
untrusted app, it just has to validate the response and handle not
getting a response at all. There are also different levels of trust. I
may have trusted an app to provide a contact pictures, but not trusted
it to block suspend. When the phone rings the app will be called to
provide the picture for the incoming call dialog, but if it is frozen
at this point the more trusted app that handles the incoming phone
call will not be able to get the picture.

> From what you're saying it follows that you're not really willing to accept
> any solution different to your suspend blockers. ?Is that really the case?
>
I don't think that is a fair way to put it. We need to support our
user-space framework and I have not seen an alternative solution that
clearly will work (other than replacing suspend_blockers with pm_qos
constraints that do the same thing).

--
Arve Hj?nnev?g

2010-06-05 22:12:43

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Thomas Gleixner <[email protected]>:
> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
> >> > Why is it a BUG in the trusted app, when I initiate a download and put
> >> > the phone down ?
> >> >
> >>
> >> It is not, but we have had bugs where a trusted app does not unblock
> >> suspend after some failure case where it is no longer making any
> >> progress.
> >
> > Well, that's simply an application bug which sucks battery with or
> > without suspend blockers. So it's unrelated to the freezing of
> > untrusted apps while a trusted app still works in the background
> > before allowing the machine to suspend.
> >
>
> It is not unrelated if the trusted app has stopped working but still
> blocks suspend. The battery drains when you combine them.

What you are describing is a problem which is not solvable either way.
If you take the lock and do not release it you're not going to
suspend. I never claimed that any other mechanism resolves this.

But this is not related to the fact that freezing crap while running a
sane background task is going to save you power vs. an approach where
running a sane background task allows crap to consume power unconfined
until it is done.

> >> > That download might take a minute or two, but that's not an
> >> > justification for the crapplication to run unconfined and prevent
> >> > lower power states.
> >> >
> >>
> >> I agree, but this is not a simple problem to solve.
> >
> > Not with suspend blockers, but with cgroup confinement of crap, it's
> > straight forward.
> >
>
> I don't think is is straight forward. If the a process in the frozen
> group holds a resource that a process in the unfrozen group needs, how
> do deal with that?

I'm going to fix the framework which puts the group into freeze state
w/o making sure that there is no held shared resource. Come on it's
not rocket science.

> >> Yes I don't know everything our user-space stack is doing, but I do
> >> know that it makes many calls between processes (and in both
> >> directions). As far as I know it uses timeouts when calling into
> >> untrusted code, so a misbehaving application will cause an error
> >> dialog to pop up asking if the user if it should wait longer or
> >> terminate the application.
> >
> > Sigh, the more I learn about the details of android and it's violation
> > of all sane engineering principles the more I understand why you
> > invented a huge nail to push through all layers in order to bring the
> > system into idle at all. And yes, you need a sledge hammer to drive
> > that big nail through everything, so you are using the right tool.
> >
> > Seriously, the cross app call goes through your framework, which
> > already knows, that the untrusted part is frozen. So it can deal
> > nicely with it in any way you want including unfreezing.
>
> Cross app calls do not go through a central process.

It's not about a central process, it goes through your framework,
which should be able to deal with it. If not, it's a design failure
which needs to be fixed at the place where the failure happened.

> >>
> >> How can it be fixed? The user presses the back button, the framework
> >> determines that app A is in the foreground and send the key to app A,
> >> app A decides that it it does not have anything internal to go back to
> >> and tells the framework to switch back to the previous app. If the
> >> user presses the back key again, the framework does not know which app
> >> this key should go to until app A has finished processing the first
> >> key press.
> >
> > Errm, what has this to do with frozen apps? If your system is
> > handling input events then there are no frozen apps and even if they
> > are frozen your framework can unfreeze them _before_ talking to them.
> >
> > So which unfixable problem are you describing with the above example ?
> >
>
> You are claiming that trusted code should not have any dependencies on
> untrusted code. I gave you a visible example of such a dependency and
> want you to tell me how you can avoid this dependency. Since you are
> claiming that our user-space framework is fundamentally broken if it
> has to wait for untrusted code, I don't think it is unreasonable for
> you to answer this. Or do you think it is valid to communicate with
> untrusted code when the screen is on but not when it is off.

It does not matter whether the screen is off or not. If you need to
call into that untrusted app from your trusted app and you know about
the might be frozen state then you can deal with it.

So taking your example:

Event happens and gets delivered to the framework

framework selects A because it is in the foreground

if (A is frozen)
unfreeze(A)

deliver_event_to(A)

It's that simple.

If your framework cannot deal with that simple problem then you have a
much more serious problem already.

Thanks,

tglx

2010-06-05 22:20:53

by Arjan van de Ven

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010 14:26:14 -0700
Arve Hjønnevåg <[email protected]> wrote:

> On Sat, Jun 5, 2010 at 9:28 AM, Arjan van de Ven
> <[email protected]> wrote:
> > On Sat, 05 Jun 2010 11:54:13 +0200
> > Peter Zijlstra <[email protected]> wrote:
> >
> >> On Fri, 2010-06-04 at 17:10 -0700, Arve Hjønnevåg wrote:
> >> > > Trusted processes are assumed to be sane and idle when there is
> >> > > nothing for them to do, allowing the machine to go into deep
> >> > > idle states.
> >> > >
> >> >
> >> > Neither the kernel nor our trusted user-space code currently
> >> > meets this criteria.
> >>
> >> Then both need fixing. Really, that's the only sane approach.
> >
> > fwiw... in MeeGo we're seeing quite good idle times (> 1 seconds)
> > without really bad hacks.
> >
>
> We clearly have different standards for what we consider good. We
> measure time suspended in minutes or hours, not seconds, and waking up
> every second or two causes a noticeable decrease in battery life on
> the hardware we have today.

I guess I'm spoiled working with (unreleased) hardware that knows how
to power gate ;-)


>
> > the kernel has a set of infrastructure already to help here (range
> > timers, with which you can wakeup-limit untrusted userspace crap),
> > timer slack for legacy background timers, etc etc.
>
> Range timers allows the kernel to align different timers so they don't
> each bring the cpu out of idle individually. They do not eliminate
> timers or make individual timers fire less often.

you're incorrect.
With range timers you can control the rate at which timers fire just
fine.

For example if the Adobe Flash player puts a timer every 10
milliseconds (yes it does that), and you give it a 3.99 seconds range,
it will fire its timers every 4 seconds.... unless other activity
happens independently, at which point it'll align with that instead.



--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2010-06-05 22:26:42

by Brian Swetland

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, Jun 5, 2010 at 3:23 PM, Arjan van de Ven <[email protected]> wrote:
>>
>> We clearly have different standards for what we consider good. We
>> measure time suspended in minutes or hours, not seconds, and waking up
>> every second or two causes a noticeable decrease in battery life on
>> the hardware we have today.
>
> I guess I'm spoiled working with (unreleased) hardware that knows how
> to power gate ;-)

I'm continually surprised by answers like this. We run on hardware
that power gates very aggressively and draws in the neighborhood of
1-2mA at the battery when in the lowest state (3-5mA while the radio
is connected to the network and paging). Waking up out of that lowest
state and executing code every few seconds or (worse) several times a
second) will raise your average power consumption. Being able to stay
parked at the very bottom for minutes or hours at a time when nothing
"interesting" is happening is very useful and can have a significant
impact on overall battery life.

Brian

2010-06-05 22:39:46

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Arjan van de Ven <[email protected]>:
> On Sat, 5 Jun 2010 14:26:14 -0700
> Arve Hj?nnev?g <[email protected]> wrote:
>
>> On Sat, Jun 5, 2010 at 9:28 AM, Arjan van de Ven
>> <[email protected]> wrote:
>> > On Sat, 05 Jun 2010 11:54:13 +0200
>> > Peter Zijlstra <[email protected]> wrote:
>> >
>> >> On Fri, 2010-06-04 at 17:10 -0700, Arve Hj?nnev?g wrote:
>> >> > > Trusted processes are assumed to be sane and idle when there is
>> >> > > nothing for them to do, allowing the machine to go into deep
>> >> > > idle states.
>> >> > >
>> >> >
>> >> > Neither the kernel nor our trusted user-space code currently
>> >> > meets this criteria.
>> >>
>> >> Then both need fixing. Really, that's the only sane approach.
>> >
>> > fwiw... in MeeGo we're seeing quite good idle times (> 1 seconds)
>> > without really bad hacks.
>> >
>>
>> We clearly have different standards for what we consider good. We
>> measure time suspended in minutes or hours, not seconds, and waking up
>> every second or two causes a noticeable decrease in battery life on
>> the hardware we have today.
>
> I guess I'm spoiled working with (unreleased) hardware that knows how
> to power gate ;-)
>
>
>>
>> > the kernel has a set of infrastructure already to help here (range
>> > timers, with which you can wakeup-limit untrusted userspace crap),
>> > timer slack for legacy background timers, etc etc.
>>
>> Range timers allows the kernel to align different timers so they don't
>> each bring the cpu out of idle individually. They do not eliminate
>> timers or make individual timers fire less often.
>
> you're incorrect.
> With range timers you can control the rate at which timers fire just
> fine.
>
> For example if the Adobe Flash player puts a timer every 10
> milliseconds (yes it does that), and you give it a 3.99 seconds range,
> it will fire its timers every 4 seconds.... unless other activity
> happens independently, at which point it'll align with that instead.
>

If you do that what you are delivering is nowhere close to what the
app asked for. You don't need range timers for this, you could just as
well add 4 seconds to all normal timers.

--
Arve Hj?nnev?g

2010-06-05 22:43:08

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Saturday 05 June 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Thomas Gleixner <[email protected]>:
> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
> >
> >> 2010/6/4 Thomas Gleixner <[email protected]>:
> >> > Arve,
> >> >
> >> > On Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
> >> >
> >> >> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
> >> >> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> >> >> >> I kind of agree here, so I'd like to focus a bit on that.
> >> >> >>
> >> >> >> Here's my idea in the very general terms:
> >> >> >>
> >> >> >> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
> >> >> >> that don't use suspend blockers aka wakelocks in the Android world) at the
> >> >> >> point Android would normally start opportunistic suspend.
> >> >> >
> >> >> > There is an additional benefit to this approach:
> >> >> >
> >> >> > In the current android world a background task (e.g. download
> >> >> > initiated before the screensaver kicked in) prevents the suspend,
> >> >> > but that also means that the crapplications can still suck power
> >> >> > completely unconfined.
> >> >> >
> >> >>
> >> >> Yes this can happen. It is usually only a big problem when you combine
> >> >> an (trusted) application that has a bug that blocks suspend forever
> >> >> with an application that wakes up too often for us to enter low power
> >> >> idle modes.
> >> >
> >> > Why is it a BUG in the trusted app, when I initiate a download and put
> >> > the phone down ?
> >> >
> >>
> >> It is not, but we have had bugs where a trusted app does not unblock
> >> suspend after some failure case where it is no longer making any
> >> progress.
> >
> > Well, that's simply an application bug which sucks battery with or
> > without suspend blockers. So it's unrelated to the freezing of
> > untrusted apps while a trusted app still works in the background
> > before allowing the machine to suspend.
> >
>
> It is not unrelated if the trusted app has stopped working but still
> blocks suspend. The battery drains when you combine them.
>
> >> > That download might take a minute or two, but that's not an
> >> > justification for the crapplication to run unconfined and prevent
> >> > lower power states.
> >> >
> >>
> >> I agree, but this is not a simple problem to solve.
> >
> > Not with suspend blockers, but with cgroup confinement of crap, it's
> > straight forward.
> >
>
> I don't think is is straight forward. If the a process in the frozen
> group holds a resource that a process in the unfrozen group needs, how
> do deal with that?

That depends a good deal on what you mean by holding a resource.

Generally, however, if your "trusted" processes depend on the processes you
don't trust, then either the former should not be trusted, or the latter should
be trusted.

> >> >> > With the cgroup freezer you can "suspend" them right away and
> >> >> > just keep the trusted background task(s) alive which allows us to
> >> >> > go into deeper idle states instead of letting the crapplications
> >> >> > run unconfined until the download finished and the suspend
> >> >> > blocker goes away.
> >> >> >
> >> >>
> >> >> Yes this would be better, but I want it in addition to suspend, not
> >> >> instead of it. It is also unclear if our user-space code could easily
> >> >> make use of it since our trusted code calls into untrusted code.
> >> >
> >> > Sorry, that's really the worst argument I saw in this whole
> >> > discussion.
> >> >
> >> > You're basically saying, that you have no idea what your user space
> >> > stack is doing and you do not care at all as long as your suspend
> >> > blocker scheme makes things work somehow.
> >> >
> >>
> >> Yes I don't know everything our user-space stack is doing, but I do
> >> know that it makes many calls between processes (and in both
> >> directions). As far as I know it uses timeouts when calling into
> >> untrusted code, so a misbehaving application will cause an error
> >> dialog to pop up asking if the user if it should wait longer or
> >> terminate the application.
> >
> > Sigh, the more I learn about the details of android and it's violation
> > of all sane engineering principles the more I understand why you
> > invented a huge nail to push through all layers in order to bring the
> > system into idle at all. And yes, you need a sledge hammer to drive
> > that big nail through everything, so you are using the right tool.
> >
> > Seriously, the cross app call goes through your framework, which
> > already knows, that the untrusted part is frozen. So it can deal
> > nicely with it in any way you want including unfreezing.
>
> Cross app calls do not go through a central process.

Well, yeah.

Arve, we're still learning you have some more requirements we had no idea
about before and such that _only_ the suspend blockers (or wakelocks) framework
is suitable to satisfy them. I don't realistically think we can make any
progress this way.

> >> > Up to that point, I really tried hard to step back from my initial
> >> > "OMG, promoting crap is a nono" reaction and work with you on a
> >> > sensible technical solution to confine crap and make it aligned with
> >> > other efforts in this area.
> >> >
> >> > So now, after I spent a reasonable amount of time (as you did) to
> >> > understand what your requirements are, you come up with another
> >> > restriction which is so outside of any level of sanity, that I'm at
> >> > the point of giving up and just going into NAK mode.
> >> >
> >>
> >> I don't think this is a new restriction. Both Brian and I have
> >> mentioned that we have a lot of dependencies between processes.

Which is not the same as "the dependencies are such that they can't be
taken into account in any way other than by using wakelocks (or suspend
blockers)".

> >> > Can you please answer the following question:
> >> >
> >> > What is the point of having the distinction of "trusted" and
> >> > "untrusted" when you have no way to prevent "trusted" code calling
> >> > "into "untrusted" code ?
> >> >
> >>
> >> Trusted code that calls into untrusted code has to deal with the
> >> untrusted code not responding, but we only want to pop up a message
> >> that the application is not responding if it is misbehaving, not just
> >> because it was frozen though no fault of its own.

When Android starts opportunistic suspend, all applications are frozen,
"trusted" as well as "untrusted", right? So, after they are all frozen, none
of them can do anything to prevent suspend from happening, right?

Now, in my proposed approach the "untrusted" apps are frozen exactly at the
point Android would start opportunistic suspend and they wouldn't be able
to do anything about that anyway. So if one of your "trusted" apps depends
on the "untrusted" ones in a way that you describe, you alread have a bug
(the "trusted" app cannot prevent automatic suspend from happening even if it
wants, because it depends on an "untrusted" app that has just been frozen).

> >> > That's violating any sense of abstraction and layering and makes it
> >> > entirely clear that the only way you can deal with your own design
> >> > failure is a big hammer which you need to force into the kernel.
> >> >
> >>
> >> How can it be fixed? The user presses the back button, the framework
> >> determines that app A is in the foreground and send the key to app A,
> >> app A decides that it it does not have anything internal to go back to
> >> and tells the framework to switch back to the previous app. If the
> >> user presses the back key again, the framework does not know which app
> >> this key should go to until app A has finished processing the first
> >> key press.
> >
> > Errm, what has this to do with frozen apps? If your system is
> > handling input events then there are no frozen apps and even if they
> > are frozen your framework can unfreeze them _before_ talking to them.
> >
> > So which unfixable problem are you describing with the above example ?
> >
>
> You are claiming that trusted code should not have any dependencies on
> untrusted code.

Not "any". It shouldn't have dependencies that make a difference between
"trusted" and "untrusted".

Think of security, for example. A root-owned process surely can exchange data
with processes owned by non-root users, but it shouldn't blindly accept any
data these processes give it.

Your wakelock-holding application is a counterpart of the root-owned process
above. It can exchange data with processes that don't take wakelocks, but not
in such a way that would prevent them from taking wakelocks if necessary
(or from dropping wakelocks if no longer needed from their point of view).

If this condition is satisfied, then I claim you won't have any problems with
freezing the "untrusted" apps upfront. If this condition is not satisfied, in
turn, your framework already doesn't work.

Rafael

2010-06-05 22:45:54

by Arjan van de Ven

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010 15:26:36 -0700
Brian Swetland <[email protected]> wrote:

>
> I'm continually surprised by answers like this. We run on hardware
> that power gates very aggressively and draws in the neighborhood of
> 1-2mA at the battery when in the lowest state (3-5mA while the radio
> is connected to the network and paging). Waking up out of that lowest
> state and executing code every few seconds or (worse) several times a
> second) will raise your average power consumption. Being able to stay
> parked at the very bottom for minutes or hours at a time when nothing
> "interesting" is happening is very useful and can have a significant
> impact on overall battery life.

It's relatively simple math.

If you wake up for a burst of work, you burn power at the higher level
P1 (versus the lower power level P2), for, lets say an average time T,
with a relatively small T (few milliseconds at most).

If you wake up X times per second (with X being a fractional number, so
can be smaller than 1) the extra power consumption factor is

X * T * P1
-------------------------------
X * T * P1 + (1.0 - X * T) * P2

if you draw a graph of this, for real values of P and T, there's a real
point where you hit diminishing returns.

if say T is 5 milliseconds (that's a high amount), and X is 1
wakeup/second, then there's already a 200:1 ratio in time an power.

If X goes to once every 10 seconds (not unreasonable, especially since
any real device will pull email and stuff in the backgroudn), you have
2000:1 time and power ratios...

Unless your "on" power is insane high (and hopefully it's not, since
you're not turning on the whole device obviously, you do selective
power and clock gating)... that "divide by 200 or 2000" makes the whole
problem go away.. in the "seconds" range for really low power devices.
Not in "hours" range.


On laptops (which have much more poor powermanagement) this point is
around 40 milliseconds or so.. but on phone silicon that I've seen,
both Intel and others, this is in the 1 to 5 seconds range.





--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2010-06-05 22:51:03

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sunday 06 June 2010, Brian Swetland wrote:
> On Sat, Jun 5, 2010 at 3:23 PM, Arjan van de Ven <[email protected]> wrote:
> >>
> >> We clearly have different standards for what we consider good. We
> >> measure time suspended in minutes or hours, not seconds, and waking up
> >> every second or two causes a noticeable decrease in battery life on
> >> the hardware we have today.
> >
> > I guess I'm spoiled working with (unreleased) hardware that knows how
> > to power gate ;-)
>
> I'm continually surprised by answers like this. We run on hardware
> that power gates very aggressively and draws in the neighborhood of
> 1-2mA at the battery when in the lowest state (3-5mA while the radio
> is connected to the network and paging). Waking up out of that lowest
> state and executing code every few seconds or (worse) several times a
> second) will raise your average power consumption. Being able to stay
> parked at the very bottom for minutes or hours at a time when nothing
> "interesting" is happening is very useful and can have a significant
> impact on overall battery life.

Yes, and if you look at the approach I proposed in this very thread
(http://lkml.org/lkml/2010/6/4/368), it goes exactly in this direction.

And I think it is superior to the opportunistic suspend framework you have
right now, because, for example, it doesn't require you to carry out full
system resume and full system suspend every once a while to check battery
status.

And guess what, suspending and resuming the whole system actually uses energy.

Rafael

2010-06-05 23:02:17

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sunday 06 June 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Rafael J. Wysocki <[email protected]>:
> > On Saturday 05 June 2010, Arve Hj?nnev?g wrote:
> >> 2010/6/4 Matt Helsley <[email protected]>:
> >> > On Fri, Jun 04, 2010 at 05:39:17PM -0700, Arve Hj?nnev?g wrote:
> >> >> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
> >> >> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> >> >
> >> > <snip>
> >> >
> >> >>
> >> >> > With the cgroup freezer you can "suspend" them right away and
> >> >> > just keep the trusted background task(s) alive which allows us to
> >> >> > go into deeper idle states instead of letting the crapplications
> >> >> > run unconfined until the download finished and the suspend
> >> >> > blocker goes away.
> >> >> >
> >> >>
> >> >> Yes this would be better, but I want it in addition to suspend, not
> >> >> instead of it. It is also unclear if our user-space code could easily
> >> >> make use of it since our trusted code calls into untrusted code.
> >> >>
> >> >
> >> > Perhaps I'm misunderstanding, but suspend and the cgroup freezer
> >> > interoperate well today -- you don't have to choose one or the other.
> >> > If you've discovered otherwise I'd consider it a bug and would like to
> >> > hear more about it.
> >> >
> >>
> >> I'm not aware of any bug with combining both, but we cannot use
> >> suspend at all without suspend blockers in the kernel (since wakeup
> >> events may be ignored)
> >
> > The more I think of it, the more it appears to me that the problem of
> > lost wakeup events can actually be solved without suspend blockers.
> > I'll send a bunch of patches to address this issue, probably tomorrow.
> >
>
> I know of two ways to prevent lost wakeup events. Reset a timeout
> every time you receive a wakeup event or prevents suspend until you
> know the event has been fully processed. Does your solution fall onto
> one of these two categories, or do you have a third way?

Basically, it involves two mechanisms, detection of wakeup events occuring
right before suspend is started and aborting suspend if wakeup events occur
in the middle of it.

> >> and I don't know how we can safely freeze
> >> cgroups without funneling all potential wakeup events through a
> >> process that never gets frozen.
> >
> > If your untrusted apps get called by the trusted ones, they aren't really
> > untrusted in the first place.
> >
> That is not a correct statement. A trusted apps can call into an
> untrusted app, it just has to validate the response and handle not
> getting a response at all. There are also different levels of trust. I
> may have trusted an app to provide a contact pictures, but not trusted
> it to block suspend. When the phone rings the app will be called to
> provide the picture for the incoming call dialog, but if it is frozen
> at this point the more trusted app that handles the incoming phone
> call will not be able to get the picture.

It will be able to do that if it causes the frozen part of user space to be
thawed.

I think you have this problem already, though, because you use full system
suspend and all of your apps are frozen by it. So, to handle the situation you
describe above, you need to carry out full system resume that will thaw the
tasks for you. I don't see any fundamental difference betwee the two cases.

> > From what you're saying it follows that you're not really willing to accept
> > any solution different to your suspend blockers. Is that really the case?
> >
> I don't think that is a fair way to put it. We need to support our
> user-space framework and I have not seen an alternative solution that
> clearly will work (other than replacing suspend_blockers with pm_qos
> constraints that do the same thing).

Then think again of the approach I proposed and explain to me why it won't
work, because I haven't seen any convincing argument on that yet.

Rafael

2010-06-05 23:21:10

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Thomas Gleixner <[email protected]>:
> On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>> 2010/6/5 Thomas Gleixner <[email protected]>:
>> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
>> >> > Why is it a BUG in the trusted app, when I initiate a download and put
>> >> > the phone down ?
>> >> >
>> >>
>> >> It is not, but we have had bugs where a trusted app does not unblock
>> >> suspend after some failure case where it is no longer making any
>> >> progress.
>> >
>> > Well, that's simply an application bug which sucks battery with or
>> > without suspend blockers. So it's unrelated to the freezing of
>> > untrusted apps while a trusted app still works in the background
>> > before allowing the machine to suspend.
>> >
>>
>> It is not unrelated if the trusted app has stopped working but still
>> blocks suspend. The battery drains when you combine them.
>
> What you are describing is a problem which is not solvable either way.
> If you take the lock and do not release it you're not going to
> suspend. I never claimed that any other mechanism resolves this.
>
Whether you claimed it or not, this is the only case where using
cgroups would have a significant power saving over what we get with
suspend. The trusted app is idle and the untrusted app is frozen, so
we enter a low power mode from idle.

> But this is not related to the fact that freezing crap while running a
> sane background task is going to save you power vs. an approach where
> running a sane background task allows crap to consume power unconfined
> until it is done.
>
If the task that is blocking suspend is using the cpu anyway, then the
bad app does not increase the power consumption nearly as much as if
the task that blocked suspend is idle.

>> >> > That download might take a minute or two, but that's not an
>> >> > justification for the crapplication to run unconfined and prevent
>> >> > lower power states.
>> >> >
>> >>
>> >> I agree, but this is not a simple problem to solve.
>> >
>> > Not with suspend blockers, but with cgroup confinement of crap, it's
>> > straight forward.
>> >
>>
>> I don't think is is straight forward. If the a process in the frozen
>> group holds a resource that a process in the unfrozen group needs, how
>> do deal with that?
>
> I'm going to fix the framework which puts the group into freeze state
> w/o making sure that there is no held shared resource. Come on it's
> not rocket science.
>

I'm not sure which framework you are talking about here, but I don't
think there is a single framework that knows about all shared
resources.

>> >> Yes I don't know everything our user-space stack is doing, but I do
>> >> know that it makes many calls between processes (and in both
>> >> directions). As far as I know it uses timeouts when calling into
>> >> untrusted code, so a misbehaving application will cause an error
>> >> dialog to pop up asking if the user if it should wait longer or
>> >> terminate the application.
>> >
>> > Sigh, the more I learn about the details of android and it's violation
>> > of all sane engineering principles the more I understand why you
>> > invented a huge nail to push through all layers in order to bring the
>> > system into idle at all. And yes, you need a sledge hammer to drive
>> > that big nail through everything, so you are using the right tool.
>> >
>> > Seriously, the cross app call goes through your framework, which
>> > already knows, that the untrusted part is frozen. So it can deal
>> > nicely with it in any way you want including unfreezing.
>>
>> Cross app calls do not go through a central process.
>
> It's not about a central process, it goes through your framework,
> which should be able to deal with it. If not, it's a design failure
> which needs to be fixed at the place where the failure happened.
>
>> >>
>> >> How can it be fixed? The user presses the back button, the framework
>> >> determines that app A is in the foreground and send the key to app A,
>> >> app A decides that it it does not have anything internal to go back to
>> >> and tells the framework to switch back to the previous app. If the
>> >> user presses the back key again, the framework does not know which app
>> >> this key should go to until app A has finished processing the first
>> >> key press.
>> >
>> > Errm, what has this to do with frozen apps? If your system is
>> > handling input events then there are no frozen apps and even if they
>> > are frozen your framework can unfreeze them _before_ talking to them.
>> >
>> > So which unfixable problem are you describing with the above example ?
>> >
>>
>> You are claiming that trusted code should not have any dependencies on
>> untrusted code. I gave you a visible example of such a dependency and
>> want you to tell me how you can avoid this dependency. Since you are
>> claiming that our user-space framework is fundamentally broken if it
>> has to wait for untrusted code, I don't think it is unreasonable for
>> you to answer this. Or do you think it is valid to communicate with
>> untrusted code when the screen is on but not when it is off.
>
> It does not matter whether the screen is off or not. If you need to
> call into that untrusted app from your trusted app and you know about
> the might be frozen state then you can deal with it.
>
> So taking your example:
>
> Event happens and gets delivered to the framework
>
> ? ? ?framework selects A because it is in the foreground
>
> ? ? ?if (A is frozen)
> ? ? ? ? unfreeze(A)
>
> ? ? ?deliver_event_to(A)
>
> It's that simple.
>

That is too simple. You also have to prevent A from being frozen while
it is processing the event or the result would be the same as if it
was frozen beforehand.

> If your framework cannot deal with that simple problem then you have a
> much more serious problem already.
>
> Thanks,
>
> ? ? ? ?tglx
>



--
Arve Hj?nnev?g

2010-06-05 23:31:47

by Arjan van de Ven

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010 15:39:44 -0700
Arve Hjønnevåg <[email protected]> wrote:

> >
> > For example if the Adobe Flash player puts a timer every 10
> > milliseconds (yes it does that), and you give it a 3.99 seconds
> > range, it will fire its timers every 4 seconds.... unless other
> > activity happens independently, at which point it'll align with
> > that instead.
> >
>
> If you do that what you are delivering is nowhere close to what the
> app asked for.

yeah it feels a little bit suspended

> You don't need range timers for this, you could just as
> well add 4 seconds to all normal timers.

.. with the difference that with range timers, you naturally align with
other activity, so if there's system level activity, the AVERAGE service
the app gets is better by a LOT than just adding 4 seconds always.

but you knew that.... just doesn't help your case.



--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2010-06-05 23:37:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sunday 06 June 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Thomas Gleixner <[email protected]>:
> > On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> >> 2010/6/5 Thomas Gleixner <[email protected]>:
> >> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
...
> > So taking your example:
> >
> > Event happens and gets delivered to the framework
> >
> > framework selects A because it is in the foreground
> >
> > if (A is frozen)
> > unfreeze(A)
> >
> > deliver_event_to(A)
> >
> > It's that simple.
> >
>
> That is too simple. You also have to prevent A from being frozen while
> it is processing the event or the result would be the same as if it
> was frozen beforehand.

Well, the freezing of the "untrusted" part of user space needs to be triggered
somehow in the first place. Whatever mechanism is used for that, there should
be a way to tell it to not to freeze the "untrusted" part of user space for a
while. Yes, it is similar to wakelocks, but I think it can be implemented
entirely in user space.

So, in general, the "trusted" app that needs an "untrusted" one to handle stuff
will take a "freeze lock" to prevent the power manager from freezing the
"untrusted" part of user space (that will also cause it to thaw these tasks if
they are frozen at the moment) and will release the "freeze lock" when it's
done with its job. You can use timeouts and whatever you like with that and
the kernel doesn't have to participate in that (except for carrying out the
low-level freezing and thawing of the "untrusted" tasks at the power manager's
request).

Rafael

2010-06-05 23:46:34

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010, Arjan van de Ven wrote:

> On Sat, 5 Jun 2010 15:26:36 -0700
> Brian Swetland <[email protected]> wrote:
>
> >
> > I'm continually surprised by answers like this. We run on hardware
> > that power gates very aggressively and draws in the neighborhood of
> > 1-2mA at the battery when in the lowest state (3-5mA while the radio
> > is connected to the network and paging). Waking up out of that lowest
> > state and executing code every few seconds or (worse) several times a
> > second) will raise your average power consumption. Being able to stay
> > parked at the very bottom for minutes or hours at a time when nothing
> > "interesting" is happening is very useful and can have a significant
> > impact on overall battery life.
>
> It's relatively simple math.
>
> If you wake up for a burst of work, you burn power at the higher level
> P1 (versus the lower power level P2), for, lets say an average time T,
> with a relatively small T (few milliseconds at most).
>
> If you wake up X times per second (with X being a fractional number, so
> can be smaller than 1) the extra power consumption factor is
>
> X * T * P1
> -------------------------------
> X * T * P1 + (1.0 - X * T) * P2
>
> if you draw a graph of this, for real values of P and T, there's a real
> point where you hit diminishing returns.
>
> if say T is 5 milliseconds (that's a high amount), and X is 1
> wakeup/second, then there's already a 200:1 ratio in time an power.
>
> If X goes to once every 10 seconds (not unreasonable, especially since
> any real device will pull email and stuff in the backgroudn), you have
> 2000:1 time and power ratios...
>
> Unless your "on" power is insane high (and hopefully it's not, since
> you're not turning on the whole device obviously, you do selective
> power and clock gating)... that "divide by 200 or 2000" makes the whole
> problem go away.. in the "seconds" range for really low power devices.
> Not in "hours" range.

That's the whole problem. Suspend blockers are a binary all on/off
approach so you waste power just to get the thing back to
"suspend". They unleash the world and some more just to put it back
into oblivion with brute force.

Thanks,

tglx

2010-06-05 23:56:42

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Rafael J. Wysocki <[email protected]>:
> On Saturday 05 June 2010, Arve Hj?nnev?g wrote:
>> 2010/6/5 Thomas Gleixner <[email protected]>:
>> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
>> >
>> >> 2010/6/4 Thomas Gleixner <[email protected]>:
>> >> > Arve,
>> >> >
>> >> > On Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
>> >> >
>> >> >> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
>> >> >> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
>> >> >> >> I kind of agree here, so I'd like to focus a bit on that.
>> >> >> >>
>> >> >> >> Here's my idea in the very general terms:
>> >> >> >>
>> >> >> >> (1) Use the cgroup freezer to "suspend" the "untrusted" apps (ie. the ones
>> >> >> >> ? ? that don't use suspend blockers aka wakelocks in the Android world) at the
>> >> >> >> ? ? point Android would normally start opportunistic suspend.
>> >> >> >
>> >> >> > There is an additional benefit to this approach:
>> >> >> >
>> >> >> > ? ? In the current android world a background task (e.g. download
>> >> >> > ? ? initiated before the screensaver kicked in) prevents the suspend,
>> >> >> > ? ? but that also means that the crapplications can still suck power
>> >> >> > ? ? completely unconfined.
>> >> >> >
>> >> >>
>> >> >> Yes this can happen. It is usually only a big problem when you combine
>> >> >> an (trusted) application that has a bug that blocks suspend forever
>> >> >> with an application that wakes up too often for us to enter low power
>> >> >> idle modes.
>> >> >
>> >> > Why is it a BUG in the trusted app, when I initiate a download and put
>> >> > the phone down ?
>> >> >
>> >>
>> >> It is not, but we have had bugs where a trusted app does not unblock
>> >> suspend after some failure case where it is no longer making any
>> >> progress.
>> >
>> > Well, that's simply an application bug which sucks battery with or
>> > without suspend blockers. So it's unrelated to the freezing of
>> > untrusted apps while a trusted app still works in the background
>> > before allowing the machine to suspend.
>> >
>>
>> It is not unrelated if the trusted app has stopped working but still
>> blocks suspend. The battery drains when you combine them.
>>
>> >> > That download might take a minute or two, but that's not an
>> >> > justification for the crapplication to run unconfined and prevent
>> >> > lower power states.
>> >> >
>> >>
>> >> I agree, but this is not a simple problem to solve.
>> >
>> > Not with suspend blockers, but with cgroup confinement of crap, it's
>> > straight forward.
>> >
>>
>> I don't think is is straight forward. If the a process in the frozen
>> group holds a resource that a process in the unfrozen group needs, how
>> do deal with that?
>
> That depends a good deal on what you mean by holding a resource.
>

I'm not sure why that matters. Any resource held by a frozen process
could cause problems whether it is something like being the current
foreground app or a mutex in shared memory.

> Generally, however, if your "trusted" processes depend on the processes you
> don't trust, then either the former should not be trusted, or the latter should
> be trusted.
>
There are different levels of trust. Trusted processes often need to
wait for an untrusted process to release a resource for the untrusted
process to behave correctly, but the trusted process can revoke the
resource if the untrusted process does not comply in time.

>> >> >> > ? ? With the cgroup freezer you can "suspend" them right away and
>> >> >> > ? ? just keep the trusted background task(s) alive which allows us to
>> >> >> > ? ? go into deeper idle states instead of letting the crapplications
>> >> >> > ? ? run unconfined until the download finished and the suspend
>> >> >> > ? ? blocker goes away.
>> >> >> >
>> >> >>
>> >> >> Yes this would be better, but I want it in addition to suspend, not
>> >> >> instead of it. It is also unclear if our user-space code could easily
>> >> >> make use of it since our trusted code calls into untrusted code.
>> >> >
>> >> > Sorry, that's really the worst argument I saw in this whole
>> >> > discussion.
>> >> >
>> >> > You're basically saying, that you have no idea what your user space
>> >> > stack is doing and you do not care at all as long as your suspend
>> >> > blocker scheme makes things work somehow.
>> >> >
>> >>
>> >> Yes I don't know everything our user-space stack is doing, but I do
>> >> know that it makes many calls between processes (and in both
>> >> directions). As far as I know it uses timeouts when calling into
>> >> untrusted code, so a misbehaving application will cause an error
>> >> dialog to pop up asking if the user if it should wait longer or
>> >> terminate the application.
>> >
>> > Sigh, the more I learn about the details of android and it's violation
>> > of all sane engineering principles the more I understand why you
>> > invented a huge nail to push through all layers in order to bring the
>> > system into idle at all. And yes, you need a sledge hammer to drive
>> > that big nail through everything, so you are using the right tool.
>> >
>> > Seriously, the cross app call goes through your framework, which
>> > already knows, that the untrusted part is frozen. So it can deal
>> > nicely with it in any way you want including unfreezing.
>>
>> Cross app calls do not go through a central process.
>
> Well, yeah.
>
> Arve, we're still learning you have some more requirements we had no idea

What new requirement are you talking about. Did you assume all our
user-space ipc calls went though a single process?

> about before and such that _only_ the suspend blockers (or wakelocks) framework
> is suitable to satisfy them. ?I don't realistically think we can make any
> progress this way.
>
>> >> > Up to that point, I really tried hard to step back from my initial
>> >> > "OMG, promoting crap is a nono" reaction and work with you on a
>> >> > sensible technical solution to confine crap and make it aligned with
>> >> > other efforts in this area.
>> >> >
>> >> > So now, after I spent a reasonable amount of time (as you did) to
>> >> > understand what your requirements are, you come up with another
>> >> > restriction which is so outside of any level of sanity, that I'm at
>> >> > the point of giving up and just going into NAK mode.
>> >> >
>> >>
>> >> I don't think this is a new restriction. Both Brian and I have
>> >> mentioned that we have a lot of dependencies between processes.
>
> Which is not the same as "the dependencies are such that they can't be
> taken into account in any way other than by using wakelocks (or suspend
> blockers)".
>
>> >> > Can you please answer the following question:
>> >> >
>> >> > ? ?What is the point of having the distinction of "trusted" and
>> >> > ? ?"untrusted" when you have no way to prevent "trusted" code calling
>> >> > ? ?"into "untrusted" code ?
>> >> >
>> >>
>> >> Trusted code that calls into untrusted code has to deal with the
>> >> untrusted code not responding, but we only want to pop up a message
>> >> that the application is not responding if it is misbehaving, not just
>> >> because it was frozen though no fault of its own.
>
> When Android starts opportunistic suspend, all applications are frozen,
> "trusted" as well as "untrusted", right? ?So, after they are all frozen, none
> of them can do anything to prevent suspend from happening, right?

Not if you mean when we write to /sys/power/state. Processes are not
frozen until the last suspend blocker is released.

>
> Now, in my proposed approach the "untrusted" apps are frozen exactly at the
> point Android would start opportunistic suspend and they wouldn't be able
> to do anything about that anyway. ?So if one of your "trusted" apps depends
> on the "untrusted" ones in a way that you describe, you alread have a bug
> (the "trusted" app cannot prevent automatic suspend from happening even if it
> wants, because it depends on an "untrusted" app that has just been frozen).
>

I don't think what you said here is correct. If a wakeup event happens
all processed are unfrozen since the driver blocks suspend. The app
that reads this event blocks suspend before reading it. If it was busy
talking to a less trusted app when the event happened it still works
since all apps are running at this point.

>> >> > That's violating any sense of abstraction and layering and makes it
>> >> > entirely clear that the only way you can deal with your own design
>> >> > failure is a big hammer which you need to force into the kernel.
>> >> >
>> >>
>> >> How can it be fixed? The user presses the back button, the framework
>> >> determines that app A is in the foreground and send the key to app A,
>> >> app A decides that it it does not have anything internal to go back to
>> >> and tells the framework to switch back to the previous app. If the
>> >> user presses the back key again, the framework does not know which app
>> >> this key should go to until app A has finished processing the first
>> >> key press.
>> >
>> > Errm, what has this to do with frozen apps? If your system is
>> > handling input events then there are no frozen apps and even if they
>> > are frozen your framework can unfreeze them _before_ talking to them.
>> >
>> > So which unfixable problem are you describing with the above example ?
>> >
>>
>> You are claiming that trusted code should not have any dependencies on
>> untrusted code.
>
> Not "any". ?It shouldn't have dependencies that make a difference between
> "trusted" and "untrusted".
>
> Think of security, for example. ?A root-owned process surely can exchange data
> with processes owned by non-root users, but it shouldn't blindly accept any
> data these processes give it.
>
> Your wakelock-holding application is a counterpart of the root-owned process
> above. ?It can exchange data with processes that don't take wakelocks, but not
> in such a way that would prevent them from taking wakelocks if necessary
> (or from dropping wakelocks if no longer needed from their point of view).
>
> If this condition is satisfied, then I claim you won't have any problems with
> freezing the "untrusted" apps upfront. ?If this condition is not satisfied, in
> turn, your framework already doesn't work.

The problem is that properly working untrusted apps may get treated as
non-working apps and killed because they were frozen and did not
respond. Also this is not invisible to the user as the system usually
gives the app several seconds to respond.

--
Arve Hj?nnev?g

2010-06-06 00:02:18

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Arjan van de Ven <[email protected]>:
> On Sat, 5 Jun 2010 15:39:44 -0700
> Arve Hj?nnev?g <[email protected]> wrote:
>
>> >
>> > For example if the Adobe Flash player puts a timer every 10
>> > milliseconds (yes it does that), and you give it a 3.99 seconds
>> > range, it will fire its timers every 4 seconds.... unless other
>> > activity happens independently, at which point it'll align with
>> > that instead.
>> >
>>
>> If you do that what you are delivering is nowhere close to what the
>> app asked for.
>
> yeah it feels a little bit suspended
>
>> You don't need range timers for this, you could just as
>> well add 4 seconds to all normal timers.
>
> .. with the difference that with range timers, you naturally align with
> other activity, so if there's system level activity, the AVERAGE service
> the app gets is better by a LOT than just adding 4 seconds always.
>
> but you knew that.... just doesn't help your case.

So you are saying it is safe to use range timers to radically change
the requested timer interval because it does not actually get to the
value that you changed it so. But you are also saying that this will
allow the system to stay idle for that long. Something does not add
up.

--
Arve Hj?nnev?g

2010-06-06 00:06:32

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Thomas Gleixner <[email protected]>:
> > On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> >> >> > That download might take a minute or two, but that's not an
> >> >> > justification for the crapplication to run unconfined and prevent
> >> >> > lower power states.
> >> >> >
> >> >>
> >> >> I agree, but this is not a simple problem to solve.
> >> >
> >> > Not with suspend blockers, but with cgroup confinement of crap, it's
> >> > straight forward.
> >> >
> >>
> >> I don't think is is straight forward. If the a process in the frozen
> >> group holds a resource that a process in the unfrozen group needs, how
> >> do deal with that?
> >
> > I'm going to fix the framework which puts the group into freeze state
> > w/o making sure that there is no held shared resource. Come on it's
> > not rocket science.
> >
>
> I'm not sure which framework you are talking about here, but I don't
> think there is a single framework that knows about all shared
> resources.

Damn, it's not me talking about "our framework", you are mentioning
when it fits your needs.

If you do not have a clearly defined user space framework, then we
talk about a completely random conglomeration of applications which
need to be brought into submission by some global brute force
approach.

I'm tired of this, really. You just use terminlology as it fits to
defend the complete design failure of android. But you fail to trick
me :)

Can you please explain in a consistent way how the application stack
and the underlying framework (which exists according to android docs)
is handling events and how the separation of trust level works ?

We need to know that, otherwise we turn in circles forever.

Thanks,

tglx

2010-06-06 00:20:45

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Thomas Gleixner <[email protected]>:
> >> > Well, that's simply an application bug which sucks battery with or
> >> > without suspend blockers. So it's unrelated to the freezing of
> >> > untrusted apps while a trusted app still works in the background
> >> > before allowing the machine to suspend.
> >> >
> >>
> >> It is not unrelated if the trusted app has stopped working but still
> >> blocks suspend. The battery drains when you combine them.
> >
> > What you are describing is a problem which is not solvable either way.
> > If you take the lock and do not release it you're not going to
> > suspend. I never claimed that any other mechanism resolves this.
> >
> Whether you claimed it or not, this is the only case where using
> cgroups would have a significant power saving over what we get with
> suspend. The trusted app is idle and the untrusted app is frozen, so
> we enter a low power mode from idle.

Nothing else was what I said and depending on the usage pattern this
can be significant. Just you converted a perfectly sensible technical
argument into a quibble about BUGs in applicatins which are not
confinable by defintion.

> > But this is not related to the fact that freezing crap while running a
> > sane background task is going to save you power vs. an approach where
> > running a sane background task allows crap to consume power unconfined
> > until it is done.
> >
> If the task that is blocking suspend is using the cpu anyway, then the
> bad app does not increase the power consumption nearly as much as if
> the task that blocked suspend is idle.

That's utter bullshit. If the app missed to release the supsend
blocker then your crappy "while(1);" app is killing you in no time,
while the same frozen crappy "while(1);" does no harm at all.

Thanks,

tglx

2010-06-06 00:33:35

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Thomas Gleixner <[email protected]>:
> > On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> >> 2010/6/5 Thomas Gleixner <[email protected]>:
> >> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
> >> Cross app calls do not go through a central process.
> >
> > It's not about a central process, it goes through your framework,
> > which should be able to deal with it. If not, it's a design failure
> > which needs to be fixed at the place where the failure happened.
> >
> >> >>
> >> >> How can it be fixed? The user presses the back button, the framework
> >> >> determines that app A is in the foreground and send the key to app A,
> >> >> app A decides that it it does not have anything internal to go back to
> >> >> and tells the framework to switch back to the previous app. If the
> >> >> user presses the back key again, the framework does not know which app
> >> >> this key should go to until app A has finished processing the first
> >> >> key press.
> >> >
> >> > Errm, what has this to do with frozen apps? If your system is
> >> > handling input events then there are no frozen apps and even if they
> >> > are frozen your framework can unfreeze them _before_ talking to them.
> >> >
> >> > So which unfixable problem are you describing with the above example ?
> >> >
> >>
> >> You are claiming that trusted code should not have any dependencies on
> >> untrusted code. I gave you a visible example of such a dependency and
> >> want you to tell me how you can avoid this dependency. Since you are
> >> claiming that our user-space framework is fundamentally broken if it
> >> has to wait for untrusted code, I don't think it is unreasonable for
> >> you to answer this. Or do you think it is valid to communicate with
> >> untrusted code when the screen is on but not when it is off.
> >
> > It does not matter whether the screen is off or not. If you need to
> > call into that untrusted app from your trusted app and you know about
> > the might be frozen state then you can deal with it.
> >
> > So taking your example:
> >
> > Event happens and gets delivered to the framework
> >
> > ? ? ?framework selects A because it is in the foreground
> >
> > ? ? ?if (A is frozen)
> > ? ? ? ? unfreeze(A)
> >
> > ? ? ?deliver_event_to(A)
> >
> > It's that simple.
> >
>
> That is too simple. You also have to prevent A from being frozen while
> it is processing the event or the result would be the same as if it
> was frozen beforehand.

The framework decides when to freeze the app in the first place (as
your framework does now when it decides to suspend)

So it knows whether the app is frozen or not.

So it knows damend well whether it processed the event or not.

Thanks,

tglx

2010-06-06 00:34:30

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, Jun 5, 2010 at 3:48 PM, Arjan van de Ven <[email protected]> wrote:
> On Sat, 5 Jun 2010 15:26:36 -0700
> Brian Swetland <[email protected]> wrote:
>
>>
>> I'm continually surprised by answers like this. ?We run on hardware
>> that power gates very aggressively and draws in the neighborhood of
>> 1-2mA at the battery when in the lowest state (3-5mA while the radio
>> is connected to the network and paging). ?Waking up out of that lowest
>> state and executing code every few seconds or (worse) several times a
>> second) will raise your average power consumption. ?Being able to stay
>> parked at the very bottom for minutes or hours at a time when nothing
>> "interesting" is happening is very useful and can have a significant
>> impact on overall battery life.
>
> It's relatively simple math.
>
> If you wake up for a burst of work, you burn power at the higher level
> P1 (versus the lower power level P2), for, lets say an average time T,
> with a relatively small T (few milliseconds at most).
>
> If you wake up X times per second (with X being a fractional number, so
> can be smaller than 1) the extra power consumption factor is
>
> ? ? ? ? ?X * T * P1
> -------------------------------
> X * T * P1 + (1.0 - X * T) * P2
>
> if you draw a graph of this, for real values of P and T, there's a real
> point where you hit diminishing returns.
>
> if say T is 5 milliseconds (that's a high amount), and X is 1
> wakeup/second, then there's already a 200:1 ratio in time an power.
>

It is a 200:1 ratio in time not in power.

> If X goes to once every 10 seconds (not unreasonable, especially since
> any real device will pull email and stuff in the backgroudn), you have
> 2000:1 time and power ratios...
>
> Unless your "on" power is insane high (and hopefully it's not, since

The absolute "on" power is not relevant to the ratio, the difference
between on and off power is. This can easily be 100:1.

> you're not turning on the whole device obviously, you do selective
> power and clock gating)... that "divide by 200 or 2000" makes the whole
> problem go away.. in the "seconds" range for really low power devices.
> Not in "hours" range.
>

If you improve the low power state, compared to the "on" state wakeup
gets worse, not better, but yes the phone hardware we have now does
not need to stay idle for hours to get good battery life, the msm
hardware at least needs to stay idle for more than a few seconds.

>
> On laptops (which have much more poor powermanagement) this point is
> around 40 milliseconds or so.. but on phone silicon that I've seen,
> both Intel and others, this is in the 1 to 5 seconds range.
>
>
>
>
>
> --
> Arjan van de Ven ? ? ? ?Intel Open Source Technology Centre
> For development, discussion and tips for power savings,
> visit http://www.lesswatts.org
>



--
Arve Hj?nnev?g

2010-06-06 01:03:14

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Rafael J. Wysocki <[email protected]>:
> On Sunday 06 June 2010, Arve Hj?nnev?g wrote:
>> 2010/6/5 Rafael J. Wysocki <[email protected]>:
>> > On Saturday 05 June 2010, Arve Hj?nnev?g wrote:
>> >> 2010/6/4 Matt Helsley <[email protected]>:
>> >> > On Fri, Jun 04, 2010 at 05:39:17PM -0700, Arve Hj?nnev?g wrote:
>> >> >> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
>> >> >> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
>> >> >
>> >> > <snip>
>> >> >
>> >> >>
>> >> >> > ? ? With the cgroup freezer you can "suspend" them right away and
>> >> >> > ? ? just keep the trusted background task(s) alive which allows us to
>> >> >> > ? ? go into deeper idle states instead of letting the crapplications
>> >> >> > ? ? run unconfined until the download finished and the suspend
>> >> >> > ? ? blocker goes away.
>> >> >> >
>> >> >>
>> >> >> Yes this would be better, but I want it in addition to suspend, not
>> >> >> instead of it. It is also unclear if our user-space code could easily
>> >> >> make use of it since our trusted code calls into untrusted code.
>> >> >>
>> >> >
>> >> > Perhaps I'm misunderstanding, but suspend and the cgroup freezer
>> >> > interoperate well today -- you don't have to choose one or the other.
>> >> > If you've discovered otherwise I'd consider it a bug and would like to
>> >> > hear more about it.
>> >> >
>> >>
>> >> I'm not aware of any bug with combining both, but we cannot use
>> >> suspend at all without suspend blockers in the kernel (since wakeup
>> >> events may be ignored)
>> >
>> > The more I think of it, the more it appears to me that the problem of
>> > lost wakeup events can actually be solved without suspend blockers.
>> > I'll send a bunch of patches to address this issue, probably tomorrow.
>> >
>>
>> I know of two ways to prevent lost wakeup events. Reset a timeout
>> every time you receive a wakeup event or prevents suspend until you
>> know the event has been fully processed. Does your solution fall onto
>> one of these two categories, or do you have a third way?
>
> Basically, it involves two mechanisms, detection of wakeup events occuring
> right before suspend is started

This sounds like the timeout approach which I thought you did not like.

> and aborting suspend if wakeup events occur
> in the middle of it.
>
Aborting suspend is easy, but when do you allow suspend again?

>> >> and I don't know how we can safely freeze
>> >> cgroups without funneling all potential wakeup events through a
>> >> process that never gets frozen.
>> >
>> > If your untrusted apps get called by the trusted ones, they aren't really
>> > untrusted in the first place.
>> >
>> That is not a correct statement. A trusted apps can call into an
>> untrusted app, it just has to validate the response and handle not
>> getting a response at all. There are also different levels of trust. I
>> may have trusted an app to provide a contact pictures, but not trusted
>> it to block suspend. When the phone rings the app will be called to
>> provide the picture for the incoming call dialog, but if it is frozen
>> at this point the more trusted app that handles the incoming phone
>> call will not be able to get the picture.
>
> It will be able to do that if it causes the frozen part of user space to be
> thawed.
>
> I think you have this problem already, though, because you use full system
> suspend and all of your apps are frozen by it. ?So, to handle the situation you
> describe above, you need to carry out full system resume that will thaw the
> tasks for you. ?I don't see any fundamental difference betwee the two cases.
>

Yes, we can keep all our user space suspend blockers and thaw the
frozen cgroup when any suspend blocker is held, but this would
eliminate any power advantage that freezing a cgroup has over using
suspend to freeze all processes. Without annotating the drivers to
block the cgroup freezing in the same places as we now block suspend,
it also prevents processes in the cgroup that we freeze from directly
consuming wakup events.

>> > From what you're saying it follows that you're not really willing to accept
>> > any solution different to your suspend blockers. ?Is that really the case?
>> >
>> I don't think that is a fair way to put it. We need to support our
>> user-space framework and I have not seen an alternative solution that
>> clearly will work (other than replacing suspend_blockers with pm_qos
>> constraints that do the same thing).
>
> Then think again of the approach I proposed and explain to me why it won't
> work, because I haven't seen any convincing argument on that yet.
>

If you are referring to the approach that we don't use suspend but
freeze a cgroup instead, this only solves the problem of bad apps. It
does not help pause timers in trusted user space code and in the
kernel, so it does not lower our average power consumption. And, it
does not solve the problem for systems that enters lower power states
from suspend than it can from idle. The last point my not be relevant
to android anymore, but desktop systems already have auto suspend and
it would be preferable to have a race free kernel api for this.

--
Arve Hj?nnev?g

2010-06-06 01:16:51

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Thomas Gleixner <[email protected]>:
> On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>> 2010/6/5 Thomas Gleixner <[email protected]>:
>> > On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>> >> >> > That download might take a minute or two, but that's not an
>> >> >> > justification for the crapplication to run unconfined and prevent
>> >> >> > lower power states.
>> >> >> >
>> >> >>
>> >> >> I agree, but this is not a simple problem to solve.
>> >> >
>> >> > Not with suspend blockers, but with cgroup confinement of crap, it's
>> >> > straight forward.
>> >> >
>> >>
>> >> I don't think is is straight forward. If the a process in the frozen
>> >> group holds a resource that a process in the unfrozen group needs, how
>> >> do deal with that?
>> >
>> > I'm going to fix the framework which puts the group into freeze state
>> > w/o making sure that there is no held shared resource. Come on it's
>> > not rocket science.
>> >
>>
>> I'm not sure which framework you are talking about here, but I don't
>> think there is a single framework that knows about all shared
>> resources.
>
> Damn, it's not me talking about "our framework", you are mentioning
> when it fits your needs.

You said you were going to fix the framework. I did know if you were
talking about the cgroup framework, or the android user-space
frameworks. I don't think either has knowledge about all shared
resources.

>
> If you do not have a clearly defined user space framework, then we
> talk about a completely random conglomeration of applications which
> need to be brought into submission by some global brute force
> approach.
>
> I'm tired of this, really. You just use terminlology as it fits to
> defend the complete design failure of android. But you fail to trick
> me :)
>
> Can you please explain in a consistent way how the application stack
> and the underlying framework (which exists according to android docs)
> is handling events and how the separation of trust level works ?
>

I don't think I can, since I only know small parts of it. I know some
events like input event go though a single thread in our system
process, while other events like network packets (which are also
wakeup events) goes directly to the app.

> We need to know that, otherwise we turn in circles forever.
>
> Thanks,
>
> ? ? ? ?tglx



--
Arve Hj?nnev?g

2010-06-06 01:24:45

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Thomas Gleixner <[email protected]>:
> On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>> 2010/6/5 Thomas Gleixner <[email protected]>:
>> >> > Well, that's simply an application bug which sucks battery with or
>> >> > without suspend blockers. So it's unrelated to the freezing of
>> >> > untrusted apps while a trusted app still works in the background
>> >> > before allowing the machine to suspend.
>> >> >
>> >>
>> >> It is not unrelated if the trusted app has stopped working but still
>> >> blocks suspend. The battery drains when you combine them.
>> >
>> > What you are describing is a problem which is not solvable either way.
>> > If you take the lock and do not release it you're not going to
>> > suspend. I never claimed that any other mechanism resolves this.
>> >
>> Whether you claimed it or not, this is the only case where using
>> cgroups would have a significant power saving over what we get with
>> suspend. The trusted app is idle and the untrusted app is frozen, so
>> we enter a low power mode from idle.
>
> Nothing else was what I said and depending on the usage pattern this
> can be significant. Just you converted a perfectly sensible technical
> argument into a quibble about BUGs in applicatins which are not
> confinable by defintion.
>
>> > But this is not related to the fact that freezing crap while running a
>> > sane background task is going to save you power vs. an approach where
>> > running a sane background task allows crap to consume power unconfined
>> > until it is done.
>> >
>> If the task that is blocking suspend is using the cpu anyway, then the
>> bad app does not increase the power consumption nearly as much as if
>> the task that blocked suspend is idle.
>
> That's utter bullshit. If the app missed to release the supsend
> blocker then your crappy "while(1);" app is killing you in no time,
> while the same frozen crappy "while(1);" does no harm at all.
>
This is the bug I described above. If the app that blocked suspend did
not release the suspend blocker and went idle, then another while(1)
app will drain the battery. If the app that blocked suspend only
blocked suspend while it needs to run (which is the typical reason to
block suspend) then the system is not idle anyway and the impact of
the while(1) app is much less severe.

--
Arve Hj?nnev?g

2010-06-06 01:46:01

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Thomas Gleixner <[email protected]>:
> On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>> 2010/6/5 Thomas Gleixner <[email protected]>:
>> > On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>> >> 2010/6/5 Thomas Gleixner <[email protected]>:
>> >> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
>> >> Cross app calls do not go through a central process.
>> >
>> > It's not about a central process, it goes through your framework,
>> > which should be able to deal with it. If not, it's a design failure
>> > which needs to be fixed at the place where the failure happened.
>> >
>> >> >>
>> >> >> How can it be fixed? The user presses the back button, the framework
>> >> >> determines that app A is in the foreground and send the key to app A,
>> >> >> app A decides that it it does not have anything internal to go back to
>> >> >> and tells the framework to switch back to the previous app. If the
>> >> >> user presses the back key again, the framework does not know which app
>> >> >> this key should go to until app A has finished processing the first
>> >> >> key press.
>> >> >
>> >> > Errm, what has this to do with frozen apps? If your system is
>> >> > handling input events then there are no frozen apps and even if they
>> >> > are frozen your framework can unfreeze them _before_ talking to them.
>> >> >
>> >> > So which unfixable problem are you describing with the above example ?
>> >> >
>> >>
>> >> You are claiming that trusted code should not have any dependencies on
>> >> untrusted code. I gave you a visible example of such a dependency and
>> >> want you to tell me how you can avoid this dependency. Since you are
>> >> claiming that our user-space framework is fundamentally broken if it
>> >> has to wait for untrusted code, I don't think it is unreasonable for
>> >> you to answer this. Or do you think it is valid to communicate with
>> >> untrusted code when the screen is on but not when it is off.
>> >
>> > It does not matter whether the screen is off or not. If you need to
>> > call into that untrusted app from your trusted app and you know about
>> > the might be frozen state then you can deal with it.
>> >
>> > So taking your example:
>> >
>> > Event happens and gets delivered to the framework
>> >
>> > ? ? ?framework selects A because it is in the foreground
>> >
>> > ? ? ?if (A is frozen)
>> > ? ? ? ? unfreeze(A)
>> >
>> > ? ? ?deliver_event_to(A)
>> >
>> > It's that simple.
>> >
>>
>> That is too simple. You also have to prevent A from being frozen while
>> it is processing the event or the result would be the same as if it
>> was frozen beforehand.
>
> The framework decides when to freeze the app in the first place (as
> your framework does now when it decides to suspend)
>
> ? ? So it knows whether the app is frozen or not.
>
> ? ? So it knows damend well whether it processed the event or not.
>

Our user-space code is not single-threaded. So just because an app was
not frozen when you checked does not mean it will remain unfrozen. We
can use the same user-space wakelock api we have now to prevent
freezing apps instead of preventing suspend, but we loose any
advantage we get from freezing just a subset of processes this way.

--
Arve Hj?nnev?g

2010-06-06 02:49:12

by Alan Stern

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:

> Yes, we can keep all our user space suspend blockers and thaw the
> frozen cgroup when any suspend blocker is held, but this would
> eliminate any power advantage that freezing a cgroup has over using
> suspend to freeze all processes. Without annotating the drivers to
> block the cgroup freezing in the same places as we now block suspend,
> it also prevents processes in the cgroup that we freeze from directly
> consuming wakup events.

The driver annotations don't need to block the cgroup freezing. They
just need to keep the system running long enough to awaken a thread
that will handle the wakeup event. (See below.) A pm-qos constraint
is good enough for this.

> If you are referring to the approach that we don't use suspend but
> freeze a cgroup instead, this only solves the problem of bad apps. It
> does not help pause timers in trusted user space code and in the
> kernel, so it does not lower our average power consumption.

You can solve this problem if you restructure your "trusted" apps in
the right way. Require a trusted app to guarantee that whenever it
doesn't hold any suspend blockers, it will do nothing but wait (in a
poll() system call for example) for a wakeup event. When the event
occurs, it must then activate a suspend blocker.

Better yet, make it more fine-grained. Instead of trusted apps, have
trusted threads. Freeze the untrusted threads along with everything
else, and require the trusted threads to satisfy this guarantee.

In this way, while the system is idle no user timers will get renewed.
Kernel timers are another matter, but we should be able to handle them.
There's nothing Android-specific about wanting to reduce kernel timer
wakeups while in a low-power mode.

> And, it
> does not solve the problem for systems that enters lower power states
> from suspend than it can from idle. The last point my not be relevant
> to android anymore, but desktop systems already have auto suspend and
> it would be preferable to have a race free kernel api for this.

This is an entirely different matter from the rest of the discussion.
It would be better to consider this separately after Android's current
problems have been addressed.

Alan Stern

2010-06-06 07:52:12

by Vitaly Wool

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/5 Arve Hj?nnev?g <[email protected]>:
> On Sat, Jun 5, 2010 at 9:28 AM, Arjan van de Ven <[email protected]> wrote:
>> On Sat, 05 Jun 2010 11:54:13 +0200
>> Peter Zijlstra <[email protected]> wrote:
>>
>>> On Fri, 2010-06-04 at 17:10 -0700, Arve Hj?nnev?g wrote:
>>> > > Trusted processes are assumed to be sane and idle when there is
>>> > > nothing for them to do, allowing the machine to go into deep idle
>>> > > states.
>>> > >
>>> >
>>> > Neither the kernel nor our trusted user-space code currently meets
>>> > this criteria.
>>>
>>> Then both need fixing. Really, that's the only sane approach.
>>
>> fwiw... in MeeGo we're seeing quite good idle times (> 1 seconds)
>> without really bad hacks.
>>
>
> We clearly have different standards for what we consider good. We
> measure time suspended in minutes or hours, not seconds, and waking up
> every second or two causes a noticeable decrease in battery life on
> the hardware we have today.

Are you stating that the existing Android implementation enters the
suspended state for hours for any of the existing designs?

~Vitaly

2010-06-06 08:06:47

by David Lang

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Fri, 4 Jun 2010, Brian Swetland wrote:

> Yeah, I do understand that we're not making it easy for ourselves
> here. I think we hit the point where Rafael and Matthew signed off on
> things and thought "aha, linux-pm maintainers are happy, now we're
> getting somewhere" only to realize the light at the end of the tunnel
> was a bit further out than we anticipated ^^

What you missed is that the linux-pm maintainers have relativly little
weight in getting things into the kernel. They are gatekeeper, so until
they approve it there is basically no chance of getting in, but even
changes that they develop and push frequently have a uphill battle to get
into the kernel, especially if they would end up touching all drivers.
There have been several proposals by the pm team that have been shot down
much more completely than wavelocks.

David Lang

2010-06-06 08:17:39

by David Lang

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Thu, 3 Jun 2010, Arjan van de Ven wrote:

> On Thu, 3 Jun 2010 19:26:50 -0700 (PDT)
> Linus Torvalds <[email protected]> wrote:
>
>>
>> If the system is idle (or almost idle) for long times, I would
>> heartily recommend actively shutting down unused cores. Some CPU's
>> are hopefully smart enough to not even need that kind of software
>> management, but I suspect even the really smart ones might be able to
>> take advantage of the kernel saying: "I'm shutting you down, you
>> don't have to worry about latency AT ALL, because I'm keeping another
>> CPU active to do any real work".
>
> sadly the reality is that "offline" is actually the same as "deepest C
> state". At best.
>
> As far as I can see, this is at least true for all Intel and AMD cpus.
>
> And because there's then no power saving (but a performance cost), it's
> actually a negative for battery life/total energy.

I believe that this assumes you are in the 'race to idle' situation where
when you finish your work you can shutdown. If the work is ongoing you may
never shutdown.

Also, what about the new CPUs where you can ramp up the clockspeed on some
cores if you hsut down other cores? that couls also benifit individual
threads.

David Lang

2010-06-06 08:20:24

by Brian Swetland

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 6, 2010 at 12:52 AM, Vitaly Wool <[email protected]> wrote:
> 2010/6/5 Arve Hjønnevåg <[email protected]>:
>>
>> We clearly have different standards for what we consider good. We
>> measure time suspended in minutes or hours, not seconds, and waking up
>> every second or two causes a noticeable decrease in battery life on
>> the hardware we have today.
>
> Are you stating that the existing Android implementation enters the
> suspended state for hours for any of the existing designs?

It varies depending on device and usage. The battery monitoring on
NexusOne happens every ten minutes, so that's the longest you'll see a
N1 suspended for. On a G1 or Dream/myTouch you can see 20-30 minutes
between wakeups (depending on network issues and background data sync
traffic), and if you have background data sync off those devices can
sit in suspend for days at a time (unless you receive a phone call or
something). In "airplane mode", with no local alarms, a device can
easily sit in the lowest power state for a month or so, until the
battery finally runs out.

Brian

2010-06-06 08:24:54

by David Lang

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Thu, 3 Jun 2010, Linus Torvalds wrote:

> On Thu, 3 Jun 2010, Linus Torvalds wrote:
>>
>> so I'd like to see the opportunistc suspend thing think about CPU
>> offlining
>
> Side note: one reason for me being somewhat interested in the CPU
> offlining is that I think the Android kind of opportunistic suspend is
> _not_ likely something I'd like to see on a desktop. But an the
> "opportunistic CPU offliner"? That might _well_ be useful even outside of
> any other suspend activity.

When the OLPC was first released there was talk that the hardware was well
designed for sleeping (including the ability for the display to keep going
even if the system itself shut down), with the idealistic talk of the
system possibly sleeping between keystrokes.

things didn't end up working (a couple pieces of hardware ended up not
playing well with others), but the concept is still something that could
end up impacting users outside of the mobile phone market, even if not on
your traditional desktop.

David Lang

2010-06-06 08:32:18

by Vitaly Wool

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 6, 2010 at 10:20 AM, Brian Swetland <[email protected]> wrote:
> On Sun, Jun 6, 2010 at 12:52 AM, Vitaly Wool <[email protected]> wrote:
>> 2010/6/5 Arve Hj?nnev?g <[email protected]>:
>>>
>>> We clearly have different standards for what we consider good. We
>>> measure time suspended in minutes or hours, not seconds, and waking up
>>> every second or two causes a noticeable decrease in battery life on
>>> the hardware we have today.
>>
>> Are you stating that the existing Android implementation enters the
>> suspended state for hours for any of the existing designs?
>
> It varies depending on device and usage. ?The battery monitoring on
> NexusOne happens every ten minutes, so that's the longest you'll see a
> N1 suspended for. ?On a G1 or Dream/myTouch you can see 20-30 minutes
> between wakeups (depending on network issues and background data sync
> traffic), and if you have background data sync off those devices can
> sit in suspend for days at a time (unless you receive a phone call or
> something). ?In "airplane mode", with no local alarms, a device can
> easily sit in the lowest power state for a month or so, until the
> battery finally runs out.

That only concerns the case when you have just turned on the phone and
left it laying around.
You have to admit that it's not the common case for a smartphone. The
common case is that you've played with it for a bit, turning on things
like BT/WIFI, running some apps and so on. And doing so you'll end up
having wake locks taken from everywhere, so I can hardly see a second
of suspend for Nexus.

E. g. when the wireless is connected to an AP, it takes a wake lock
which is released on 15 minutes touchscreen inactivity timeout, as far
as I can tell. So:

* the system will never hit suspend during this period;
* if the download was ongoing and had not been completed during this
period, it will be terminated.

So the bottom line is: the approach is very inflexible. Of course it
can give you the best power savings if you turn the Airplane mode on
as soon as you switched on the phone, but this is not what a typical
user would do.

~Vitaly

2010-06-06 09:21:17

by Brian Swetland

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 6, 2010 at 1:32 AM, Vitaly Wool <[email protected]> wrote:
>>
>> It varies depending on device and usage.  The battery monitoring on
>> NexusOne happens every ten minutes, so that's the longest you'll see a
>> N1 suspended for.  On a G1 or Dream/myTouch you can see 20-30 minutes
>> between wakeups (depending on network issues and background data sync
>> traffic), and if you have background data sync off those devices can
>> sit in suspend for days at a time (unless you receive a phone call or
>> something).  In "airplane mode", with no local alarms, a device can
>> easily sit in the lowest power state for a month or so, until the
>> battery finally runs out.
>
> That only concerns the case when you have just turned on the phone and
> left it laying around.
> You have to admit that it's not the common case for a smartphone. The
> common case is that you've played with it for a bit, turning on things
> like BT/WIFI, running some apps and so on. And doing so you'll end up
> having wake locks taken from everywhere, so I can hardly see a second
> of suspend for Nexus.

The common case for a phone is to be sitting around. Even for heavy
smartphone users, unless they power on, use the device screen-on for 4
hours solid or whatnot and drain the battery straight away, the device
is going to spend a significant portion of its operating time in
screen-off standby modes (conserving power for when you take a call,
browse the web, etc).

For typical users on typical android devices, this means the device
stays suspended for 5-10 minutes at a time, coming up for air when a
network packet (mail sync, im, etc) or alarm (battery monitor) wakes
the device briefly. Obviously with the right combination of bad apps
you will see a device suspending more rarely.

> E. g. when the wireless is connected to an AP, it takes a wake lock
> which is released on 15 minutes touchscreen inactivity timeout, as far
> as I can tell. So:
>
> * the system will never hit suspend during this period;
> * if the download was ongoing and had not been completed during this
> period, it will be terminated.

I'm pretty sure the wifi subsystem does not actually take a wakelock
while its connected -- it does have an alarm to spin down wifi after
15 minutes (by default, and user disableable) largely due to power
inefficiencies in the wifi solution in some early devices. There's
some room for improvement here, obviously. With a decent wifi chipset
and implementation, depending on local wifi traffic patterns, you can
see power usage competitive to cellular.

> So the bottom line is: the approach is very inflexible. Of course it
> can give you the best power savings if you turn the Airplane mode on
> as soon as you switched on the phone, but this is not what a typical
> user would do.

The savings in airplane mode (apart from preventing data connections,
which saves power by preventing data-hungry background apps from doing
much) is the difference between standby with radio (3-5mA) and without
(1-2mA). I'm not suggesting that airplane mode is a typical case,
just using it as in illustration of the more extreme standby case.

Users do like that to work too -- I recall Arve leaving a device in
his filing cabinet with the radio off while he was out of the country
for three weeks once, and him discovering it was still running with
something like 25% battery remaining when he returned.

In any case, I'm saying that suspending for minutes at a time
(typical, 10s of minutes or more in some cases, hours in others), does
happen and it does represent an improvement over suspending or
otherwise entering your lowest power state for seconds at a time.

Brian

2010-06-06 09:59:09

by David Lang

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010, Brian Swetland wrote:

> The savings in airplane mode (apart from preventing data connections,
> which saves power by preventing data-hungry background apps from doing
> much) is the difference between standby with radio (3-5mA) and without
> (1-2mA). I'm not suggesting that airplane mode is a typical case,
> just using it as in illustration of the more extreme standby case.

for the sake of discussion, let's say that standby is 5ma and full
operation is 500ma and a minimal wakeup is 0.1 sec. these are probably
fairly pessimistic numbers.

waking up every second would be awake 10% of the time, so in an hour you
would use .9*5mA + .1*500mA = 4.5mA +45mA = 49.5mAH

waking up every 10 seconds would be awake 1% of the time, so in an hour
you would use .99*5mA + 0.01*500mA = 4.95mA + 5mA = 9.95mAH

waking up every 100 seconds would be awake 0.1% of the time, so in an hour
you would use .999*5mA + 0.001*500mA =4.995mA + 0.5mA = 5.495mAH

waking up every 1000 seconds would be awake 0.01% of the time so in an
hour you would use .9999*5mA + 0.0001*500mA = 4.9995mA + 0.05mAH =
5.0495mAH

now if you have a 1000mAH battery (small, but reasonable for a smartphone)
your standby life would be

.1 second wakeup (on continuously) = 2 hours
1 second wakup = 20 hours
10 second wakeup = 100 hours
100 second wakeup = 182 hours
1000 second wakeup = 198 hours

if you could shrink the time awake to 0.01 second per wakeup you would
shift this all up a category (and avoiding the need to wake everything up
to service a timer would help do this)

this effort very definantly has diminishing returns as you go to larger
sleep periods as the constant standby power draw becomes more and more
dominating. someone mentioned that they were getting the sleep time of
normal systems up past the 1 second mark with the 10 second mark looking
very attainable. that is where you get the most benifit for whatever
changes are needed. getting up to a 2 min sleep time really gives you
about all the benifit that you can get, going from there to 15 min makes
very little difference.

don't let chasing the best possible sleep time prevent you from
considering options that would be good enough in time, but would
drastically reduce the maintinance effort (as things could be upstreamed
more easily), and would be usable on far more systems.

David Lang

2010-06-06 10:00:50

by Vitaly Wool

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 6, 2010 at 11:21 AM, Brian Swetland <[email protected]> wrote:
>
> The common case for a phone is to be sitting around. ?Even for heavy
> smartphone users, unless they power on, use the device screen-on for 4
> hours solid or whatnot and drain the battery straight away, the device
> is going to spend a significant portion of its operating time in
> screen-off standby modes (conserving power for when you take a call,
> browse the web, etc).

Sure, but my point was, some non-trivial (still kind of natural for a
smartphone) activities with the device will prevent it from suspending
for quite some time. Even worse, the suspend wakelock will keep the
whole kernel active, as opposed to powering off unused devices
separately as it's done in runtime PM. Yep, I know about the "early
suspend" type of thing; yet it's excess, not mainlined and lacks
granularity.

> For typical users on typical android devices, this means the device
> stays suspended for 5-10 minutes at a time, coming up for air when a
> network packet (mail sync, im, etc) or alarm (battery monitor) wakes
> the device briefly. ?Obviously with the right combination of bad apps
> you will see a device suspending more rarely.

Wasn't that you who stated that you so successfully tolerate bad apps
with opportunistic suspend that anything of the kind should not really
be the case? :)

>> E. g. when the wireless is connected to an AP, it takes a wake lock
>> which is released on 15 minutes touchscreen inactivity timeout, as far
>> as I can tell. So:
>>
>> * the system will never hit suspend during this period;
>> * if the download was ongoing and had not been completed during this
>> period, it will be terminated.
>
> I'm pretty sure the wifi subsystem does not actually take a wakelock
> while its connected -- it does have an alarm to spin down wifi after
> 15 minutes (by default, and user disableable) largely due to power
> inefficiencies in the wifi solution in some early devices.

Oh? How does it make sure it's not powered off while scanning for APs,
for instance?

> Users do like that to work too -- I recall Arve leaving a device in
> his filing cabinet with the radio off while he was out of the country
> for three weeks once, and him discovering it was still running with
> something like 25% battery remaining when he returned.

So what you're actually up to is that a user should restart the phone
and turn the radio off if he wants to find it running when he's back
from a long business trip or something. Nice...

> In any case, I'm saying that suspending for minutes at a time
> (typical, 10s of minutes or more in some cases, hours in others), does
> happen and it does represent an improvement over suspending or
> otherwise entering your lowest power state for seconds at a time.

That's for sure, if _all_ the other parameters *are* *equal*. This is
obviously not the case.

~Vitaly

2010-06-06 10:02:16

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Thomas Gleixner <[email protected]>:
> > On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> >> That is too simple. You also have to prevent A from being frozen while
> >> it is processing the event or the result would be the same as if it
> >> was frozen beforehand.
> >
> > The framework decides when to freeze the app in the first place (as
> > your framework does now when it decides to suspend)
> >
> > ? ? So it knows whether the app is frozen or not.
> >
> > ? ? So it knows damend well whether it processed the event or not.
> >
>
> Our user-space code is not single-threaded. So just because an app was
> not frozen when you checked does not mean it will remain unfrozen. We
> can use the same user-space wakelock api we have now to prevent
> freezing apps instead of preventing suspend, but we loose any
> advantage we get from freezing just a subset of processes this way.

Errm. That does not matter whether its single threaded or not. And
right, you have to prevent that it gets frozen while you are calling
into it.

But that does not change the fact that you can do finer grained power
control even in the case when suspend is impossible because a
background application has work to finish and does that without
requiring interaction with the frozen part.

That's what I pointed out in the first place and you just argue in
circles why it is impossible to do so.

Let me recapitulate:

Full on state: No difference because everything runs
Full suspend state: No difference because everything is down

Screen off, background work active:

Suspend blocker held by the active background work lets
other applications which are unrelated consume CPU cycles
and power.

versus

Frozen apps restrict the CPU cycles and power consumption to
the background work (if there are no interactions with
frozen tasks) and therefor save more power than the on/off
approach.

If your user space stack cannot be distangled that way, then
it's a problem of your user space stack and not changing the
fact, that a well designedd system allows you to do that.

Any objections ?

Thanks,

tglx






2010-06-06 10:13:50

by David Lang

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010, Vitaly Wool wrote:

> On Sun, Jun 6, 2010 at 11:21 AM, Brian Swetland <[email protected]> wrote:
>> In any case, I'm saying that suspending for minutes at a time
>> (typical, 10s of minutes or more in some cases, hours in others), does
>> happen and it does represent an improvement over suspending or
>> otherwise entering your lowest power state for seconds at a time.
>
> That's for sure, if _all_ the other parameters *are* *equal*. This is
> obviously not the case.

and while it will represent an improvement, is the cost worth the
relativly minor benifit that going from 10s of seconds of sleep to 10s of
minuites of sleep give you?

a system that wakes up every 10 seconds, but only wakes the portion of the
system needed for the wakeup can easily outlast one that wakes up far less
frequently, but when it's awake is fully awake.

as an example (taken from this thread).

system A needs to wake up to get a battery reading, store it and go
back to sleep, It does so every 10 seconds. But when it does so it only
runs the one process and then goes back to sleep.

system B has the same need, but wakes up every 10 minutes. but when it
does so it fully wakes up and this allows the mail app to power up the
radio, connect to the Internet and start checking for new mail before
oppurtunistic sleep shuts things down (causing the mail check to fail)

System A will last considerably longer on a battery than System B.

David Lang

2010-06-06 10:19:13

by Vitaly Wool

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/6 <[email protected]>:

> as an example (taken from this thread).
>
> system A needs to wake up to get a battery reading, store it and go back to
> sleep, It does so every 10 seconds. But when it does so it only runs the one
> process and then goes back to sleep.
>
> system B has the same need, but wakes up every 10 minutes. but when it does
> so it fully wakes up and this allows the mail app to power up the radio,
> connect to the Internet and start checking for new mail before oppurtunistic
> sleep shuts things down (causing the mail check to fail)
>
> System A will last considerably longer on a battery than System B.

Exactly, thanks for pointing out the specific example :)

~Vitaly

2010-06-06 10:37:05

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Thomas Gleixner <[email protected]>:
> > On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> >> 2010/6/5 Thomas Gleixner <[email protected]>:
> >> > On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> >> >> >> > That download might take a minute or two, but that's not an
> >> >> >> > justification for the crapplication to run unconfined and prevent
> >> >> >> > lower power states.
> >> >> >> >
> >> >> >>
> >> >> >> I agree, but this is not a simple problem to solve.
> >> >> >
> >> >> > Not with suspend blockers, but with cgroup confinement of crap, it's
> >> >> > straight forward.
> >> >> >
> >> >>
> >> >> I don't think is is straight forward. If the a process in the frozen
> >> >> group holds a resource that a process in the unfrozen group needs, how
> >> >> do deal with that?
> >> >
> >> > I'm going to fix the framework which puts the group into freeze state
> >> > w/o making sure that there is no held shared resource. Come on it's
> >> > not rocket science.
> >> >
> >>
> >> I'm not sure which framework you are talking about here, but I don't
> >> think there is a single framework that knows about all shared
> >> resources.
> >
> > Damn, it's not me talking about "our framework", you are mentioning
> > when it fits your needs.
>
> You said you were going to fix the framework. I did know if you were
> talking about the cgroup framework, or the android user-space
> frameworks. I don't think either has knowledge about all shared
> resources.

The cgroup freezer makes sure that there are no in kernel resources
blocked. Of course the user space side has to do the same and it's not
rocket science.

> >
> > If you do not have a clearly defined user space framework, then we
> > talk about a completely random conglomeration of applications which
> > need to be brought into submission by some global brute force
> > approach.
> >
> > I'm tired of this, really. You just use terminlology as it fits to
> > defend the complete design failure of android. But you fail to trick
> > me :)
> >
> > Can you please explain in a consistent way how the application stack
> > and the underlying framework (which exists according to android docs)
> > is handling events and how the separation of trust level works ?
> >
>
> I don't think I can, since I only know small parts of it. I know some

Sigh. That's the main reason why this discussion goes nowhere.

How in heavens sake can we make a decision whether suspend blockers
are the right and only way to go, when the people

> events like input event go though a single thread in our system
> process, while other events like network packets (which are also
> wakeup events) goes directly to the app.

2010-06-06 10:46:21

by Florian Mickler

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010 12:00:47 +0200
Vitaly Wool <[email protected]> wrote:

> Even worse, the suspend wakelock will keep the
> whole kernel active, as opposed to powering off unused devices
> separately as it's done in runtime PM.

That is not true. While the kernel is not suspended it does
runtime pm.



> > Users do like that to work too -- I recall Arve leaving a device in
> > his filing cabinet with the radio off while he was out of the country
> > for three weeks once, and him discovering it was still running with
> > something like 25% battery remaining when he returned.
>
> So what you're actually up to is that a user should restart the phone
> and turn the radio off if he wants to find it running when he's back
> from a long business trip or something. Nice...

?


Cheers,
Flo

2010-06-06 10:50:07

by Florian Mickler

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010 12:19:08 +0200
Vitaly Wool <[email protected]> wrote:

> 2010/6/6 <[email protected]>:
>
> > as an example (taken from this thread).
> >
> > system A needs to wake up to get a battery reading, store it and go back to
> > sleep, It does so every 10 seconds. But when it does so it only runs the one
> > process and then goes back to sleep.
> >
> > system B has the same need, but wakes up every 10 minutes. but when it does
> > so it fully wakes up and this allows the mail app to power up the radio,
> > connect to the Internet and start checking for new mail before oppurtunistic
> > sleep shuts things down (causing the mail check to fail)
> >
> > System A will last considerably longer on a battery than System B.
>
> Exactly, thanks for pointing out the specific example :)
>
> ~Vitaly

This does not affect suspend_blockers nor does suspend_blockers
interfere with that.

Suspend_blockers allow the system to suspend ("mem">/sys/power/state
suspend), when the userspace decides that the device is not in use.

So implementing suspend_blockers support does not impact any
optimizations done to either system A nor system B.

Cheers,
Flo

2010-06-06 10:57:42

by Thomas Gleixner

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Thomas Gleixner <[email protected]>:
> >
> > Can you please explain in a consistent way how the application stack
> > and the underlying framework (which exists according to android docs)
> > is handling events and how the separation of trust level works ?
> >
>
> I don't think I can, since I only know small parts of it. I know some

Sigh, thats the whole reason why this discussion goes nowhere.

How in heavens sake should we be able to decide whether suspend
blockers are the right and only thing which solves a problem, when the
folks advocating suspend blockers are not able to explain the problem
in the first place ?

> events like input event go though a single thread in our system
> process, while other events like network packets (which are also
> wakeup events) goes directly to the app.

Yes, we know that already, but that's a completely useless information
as it does not describe the full constraints and dependencies.

Lemme summarize:

Android needs suspend blockers, because it works, but cannot explain
why it works and why it only works that way.

A brilliant argument to merge them - NOT.

Thanks,

tglx

2010-06-06 10:58:28

by Alan

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010 12:46:01 +0200
Florian Mickler <[email protected]> wrote:

> On Sun, 6 Jun 2010 12:00:47 +0200
> Vitaly Wool <[email protected]> wrote:
>
> > Even worse, the suspend wakelock will keep the
> > whole kernel active, as opposed to powering off unused devices
> > separately as it's done in runtime PM.
>
> That is not true. While the kernel is not suspended it does
> runtime pm.

On several of our platforms runtime PM already includes suspend so a
suspend wakelock does interfere with existing power managemet at that
level (not to mention the maintenance mess it causes).

This is one of the reasons you want QoS information, it provides
parameters by which the power management code can make a decision.
Suspend blocksers simply don't have sufficient variety to manage the
direction of power policy.

If Android chooses to abuse the QoS information for crude suspend
blocking then that is fine, it doesn't interfere with doing the job
'properly' on other systems or its use for realtime work on other boxes.

Alan

2010-06-06 11:04:29

by Vitaly Wool

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/6 Florian Mickler <[email protected]>:

> Suspend_blockers allow the system to suspend ("mem">/sys/power/state
> suspend), when the userspace decides that the device is not in use.

Sorry. What? Blockers allow the system to suspend?

> So implementing suspend_blockers support does not impact any
> optimizations done to either system A nor system B.

Suspend blockers by themselves are of no use. Completely. So any talks
on suspend blockers separated from the sleep policy are completely
pointless.
The suspend blockers are of use when the userspace tries to blindly
freeze the tasks to enter the suspend state. This way of hammering the
system down obviously impacts everything.

~Vitaly

2010-06-06 11:09:58

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sat, 5 Jun 2010, Alan Stern wrote:

> > If you are referring to the approach that we don't use suspend but
> > freeze a cgroup instead, this only solves the problem of bad apps. It
> > does not help pause timers in trusted user space code and in the
> > kernel, so it does not lower our average power consumption.
>
> You can solve this problem if you restructure your "trusted" apps in
> the right way. Require a trusted app to guarantee that whenever it
> doesn't hold any suspend blockers, it will do nothing but wait (in a
> poll() system call for example) for a wakeup event. When the event
> occurs, it must then activate a suspend blocker.
>
> Better yet, make it more fine-grained. Instead of trusted apps, have
> trusted threads. Freeze the untrusted threads along with everything
> else, and require the trusted threads to satisfy this guarantee.
>
> In this way, while the system is idle no user timers will get renewed.
> Kernel timers are another matter, but we should be able to handle them.
> There's nothing Android-specific about wanting to reduce kernel timer
> wakeups while in a low-power mode.

In fact it's possible to do this with only minimal changes to the
userspace, providing you can specify all your possible hardware wakeup
sources. (On the Android this list probably isn't very large -- I
imagine it includes the keypad, the radio link(s), the RTC, and maybe
a few switches, buttons, or other things.)

Here's how you can do it. Extend the userspace suspend-blocker API, so
that each suspend blocker can optionally have an associated wakeup
source.

The power-manager process should keep a list of "active" wakeup
sources. A source gets removed from the list when an associated
suspend blocker is activated.

When the "active" list is empty and no suspend blockers are activated,
the power manager freezes ALL other processes, trusted and untrusted
alike. It then does a big poll() on all the wakeup sources. When the
poll() returns, its output is used to repopulate the "active" list and
processes are unfrozen.

(You can also include some error detection: If a source remains on the
"active" list for too long then something has gone wrong.)

To do all this you don't even need to use cgroups. The existing PM
implementation allows a user process to freeze everything but itself;
that's how swsusp and related programs work.

This is still a big-hammer sort of approach, but it doesn't require any
kernel changes.

Alan Stern

2010-06-06 11:11:54

by Felipe Contreras

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/6 <[email protected]>:
> On Sun, 6 Jun 2010, Brian Swetland wrote:
> if you could shrink the time awake to 0.01 second per wakeup you would shift
> this all up a category (and avoiding the need to wake everything up to
> service a timer would help do this)
>
> this effort very definantly has diminishing returns as you go to larger
> sleep periods as the constant standby power draw becomes more and more
> dominating. someone mentioned that they were getting the sleep time of
> normal systems up past the 1 second mark with the 10 second mark looking
> very attainable. that is where you get the most benifit for whatever changes
> are needed. getting up to a 2 min sleep time really gives you about all the
> benifit that you can get, going from there to 15 min makes very little
> difference.
>
> don't let chasing the best possible sleep time prevent you from considering
> options that would be good enough in time, but would drastically reduce the
> maintinance effort (as things could be upstreamed more easily), and would be
> usable on far more systems.

Not to mention the fact that there's nothing fundamental that prevents
dynamic PM to reach > 15 min idle. It's a matter of time before we
find the tools needed. The amount of work that suspend blockers would
require to implement properly in user-space other than Android just
doesn't match the power savings.

--
Felipe Contreras

2010-06-06 11:15:59

by David Lang

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010, Florian Mickler wrote:

> On Sun, 6 Jun 2010 12:19:08 +0200
> Vitaly Wool <[email protected]> wrote:
>
>> 2010/6/6 <[email protected]>:
>>
>>> as an example (taken from this thread).
>>>
>>> system A needs to wake up to get a battery reading, store it and go back to
>>> sleep, It does so every 10 seconds. But when it does so it only runs the one
>>> process and then goes back to sleep.
>>>
>>> system B has the same need, but wakes up every 10 minutes. but when it does
>>> so it fully wakes up and this allows the mail app to power up the radio,
>>> connect to the Internet and start checking for new mail before oppurtunistic
>>> sleep shuts things down (causing the mail check to fail)
>>>
>>> System A will last considerably longer on a battery than System B.
>>
>> Exactly, thanks for pointing out the specific example :)
>>
>> ~Vitaly
>
> This does not affect suspend_blockers nor does suspend_blockers
> interfere with that.
>
> Suspend_blockers allow the system to suspend ("mem">/sys/power/state
> suspend), when the userspace decides that the device is not in use.
>
> So implementing suspend_blockers support does not impact any
> optimizations done to either system A nor system B.

Actually, it does.

system A is what's being proposed by kernel developers, where the
untrusted stuff is in a different cgroup and what puts the system to sleep
is 'normal' power management. It doesn't sleep as long, but when it wakes
up the untrusted stuff is still frozen, so it doesn't stay awake long, or
do very much.

System B is suspend blockers where you are either awake or asleep, and
when you wake up you wake up fully, but oppertunistic sleep can interrupt
untrusted processes at any time. The system sleeps longer (as fewer things
can wake it), but when it wakes up it's fully awake.

David Lang

2010-06-06 11:18:41

by Felipe Contreras

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/6 Arjan van de Ven <[email protected]>:
> On Sat, 5 Jun 2010 14:26:14 -0700
> Arve Hjønnevåg <[email protected]> wrote:
>> > the kernel has a set of infrastructure already to help here (range
>> > timers, with which you can wakeup-limit untrusted userspace crap),
>> > timer slack for legacy background timers, etc etc.
>>
>> Range timers allows the kernel to align different timers so they don't
>> each bring the cpu out of idle individually. They do not eliminate
>> timers or make individual timers fire less often.
>
> you're incorrect.
> With range timers you can control the rate at which timers fire just
> fine.

I was wondering... Currently GLib user-space aligns itself to fire
burst of work at second boundaries without the need for IPC. But if
you want to align beyond one second you need multi-process alignment.
Say, one application says: wake me up between 30s and 1m. And the
other one says: wake me up between 10m and 20m. They could very well
align at some point if there was a central process keeping track of
all the timers.

Does the kernel provide something to solve that problem already?

--
Felipe Contreras

2010-06-06 11:27:36

by David Lang

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sun, 6 Jun 2010, Felipe Contreras wrote:

> 2010/6/6 Arjan van de Ven <[email protected]>:
>> On Sat, 5 Jun 2010 14:26:14 -0700
>> Arve Hj?nnev?g <[email protected]> wrote:
>>>> the kernel has a set of infrastructure already to help here (range
>>>> timers, with which you can wakeup-limit untrusted userspace crap),
>>>> timer slack for legacy background timers, etc etc.
>>>
>>> Range timers allows the kernel to align different timers so they don't
>>> each bring the cpu out of idle individually. They do not eliminate
>>> timers or make individual timers fire less often.
>>
>> you're incorrect.
>> With range timers you can control the rate at which timers fire just
>> fine.
>
> I was wondering... Currently GLib user-space aligns itself to fire
> burst of work at second boundaries without the need for IPC. But if
> you want to align beyond one second you need multi-process alignment.
> Say, one application says: wake me up between 30s and 1m. And the
> other one says: wake me up between 10m and 20m. They could very well
> align at some point if there was a central process keeping track of
> all the timers.
>
> Does the kernel provide something to solve that problem already?

yes,the kernel will let you say 'wake me in 10m, with a possible delay of
up to 10 min' and get woken up between 10 and 20 min

this is what was mentioned earlier in the thread where you could take an
app that tries to do something every .1 second and set the kernel to allow
the tick to be delayed for up to 3.9 seconds, resulting in a wakeup every
4 seconds if the system is idle. This has the additional advantage that if
the system is not idle and other things are causing ticks to take place
anyway, this app would get more ticks up to the 10/second it's asking for)

David Lang

2010-06-06 13:28:39

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sunday 06 June 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Rafael J. Wysocki <[email protected]>:
> > On Sunday 06 June 2010, Arve Hj?nnev?g wrote:
> >> 2010/6/5 Rafael J. Wysocki <[email protected]>:
> >> > On Saturday 05 June 2010, Arve Hj?nnev?g wrote:
> >> >> 2010/6/4 Matt Helsley <[email protected]>:
> >> >> > On Fri, Jun 04, 2010 at 05:39:17PM -0700, Arve Hj?nnev?g wrote:
> >> >> >> On Fri, Jun 4, 2010 at 5:05 PM, Thomas Gleixner <[email protected]> wrote:
> >> >> >> > On Sat, 5 Jun 2010, Rafael J. Wysocki wrote:
> >> >> >
> >> >> > <snip>
> >> >> >
> >> >> >>
> >> >> >> > With the cgroup freezer you can "suspend" them right away and
> >> >> >> > just keep the trusted background task(s) alive which allows us to
> >> >> >> > go into deeper idle states instead of letting the crapplications
> >> >> >> > run unconfined until the download finished and the suspend
> >> >> >> > blocker goes away.
> >> >> >> >
> >> >> >>
> >> >> >> Yes this would be better, but I want it in addition to suspend, not
> >> >> >> instead of it. It is also unclear if our user-space code could easily
> >> >> >> make use of it since our trusted code calls into untrusted code.
> >> >> >>
> >> >> >
> >> >> > Perhaps I'm misunderstanding, but suspend and the cgroup freezer
> >> >> > interoperate well today -- you don't have to choose one or the other.
> >> >> > If you've discovered otherwise I'd consider it a bug and would like to
> >> >> > hear more about it.
> >> >> >
> >> >>
> >> >> I'm not aware of any bug with combining both, but we cannot use
> >> >> suspend at all without suspend blockers in the kernel (since wakeup
> >> >> events may be ignored)
> >> >
> >> > The more I think of it, the more it appears to me that the problem of
> >> > lost wakeup events can actually be solved without suspend blockers.
> >> > I'll send a bunch of patches to address this issue, probably tomorrow.
> >> >
> >>
> >> I know of two ways to prevent lost wakeup events. Reset a timeout
> >> every time you receive a wakeup event or prevents suspend until you
> >> know the event has been fully processed. Does your solution fall onto
> >> one of these two categories, or do you have a third way?
> >
> > Basically, it involves two mechanisms, detection of wakeup events occuring
> > right before suspend is started
>
> This sounds like the timeout approach which I thought you did not like.
>
> > and aborting suspend if wakeup events occur
> > in the middle of it.
> >
> Aborting suspend is easy, but when do you allow suspend again?

I would recommend you to wait for the patches and then comment. :-)

> >> >> and I don't know how we can safely freeze
> >> >> cgroups without funneling all potential wakeup events through a
> >> >> process that never gets frozen.
> >> >
> >> > If your untrusted apps get called by the trusted ones, they aren't really
> >> > untrusted in the first place.
> >> >
> >> That is not a correct statement. A trusted apps can call into an
> >> untrusted app, it just has to validate the response and handle not
> >> getting a response at all. There are also different levels of trust. I
> >> may have trusted an app to provide a contact pictures, but not trusted
> >> it to block suspend. When the phone rings the app will be called to
> >> provide the picture for the incoming call dialog, but if it is frozen
> >> at this point the more trusted app that handles the incoming phone
> >> call will not be able to get the picture.
> >
> > It will be able to do that if it causes the frozen part of user space to be
> > thawed.
> >
> > I think you have this problem already, though, because you use full system
> > suspend and all of your apps are frozen by it. So, to handle the situation you
> > describe above, you need to carry out full system resume that will thaw the
> > tasks for you. I don't see any fundamental difference betwee the two cases.
> >
>
> Yes, we can keep all our user space suspend blockers and thaw the
> frozen cgroup when any suspend blocker is held, but this would
> eliminate any power advantage that freezing a cgroup has over using
> suspend to freeze all processes.

Why does it have to have _any_ power advantage? It's totally sufficient if it
gives you approximately the same power savings. IOW, it doesn't have to be
_better_ if it's acceptable to the kernel people at large.

The rest of your objections have been addressed by Alan, so I'm not going to
repeat his arguments.

Rafael

2010-06-06 13:32:11

by Matthew Garrett

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 06, 2010 at 12:00:47PM +0200, Vitaly Wool wrote:

> Sure, but my point was, some non-trivial (still kind of natural for a
> smartphone) activities with the device will prevent it from suspending
> for quite some time. Even worse, the suspend wakelock will keep the
> whole kernel active, as opposed to powering off unused devices
> separately as it's done in runtime PM. Yep, I know about the "early
> suspend" type of thing; yet it's excess, not mainlined and lacks
> granularity.

Holding a suspend blocker is entirely orthogonal to runtime pm. The
"whole kernel" will not be "active" - it can continue to hit the same
low power state in the idle loop, and any runtime pm implementation in
the drivers will continue to be active.

--
Matthew Garrett | [email protected]

2010-06-06 13:34:50

by Matthew Garrett

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 06, 2010 at 12:05:57PM +0100, Alan Cox wrote:
> On Sun, 6 Jun 2010 12:46:01 +0200
> Florian Mickler <[email protected]> wrote:
> > That is not true. While the kernel is not suspended it does
> > runtime pm.
>
> On several of our platforms runtime PM already includes suspend so a
> suspend wakelock does interfere with existing power managemet at that
> level (not to mention the maintenance mess it causes).

No, it doesn't. Android on omap will enter the mpu/core off state from
the idle loop even if a suspend block is held.

--
Matthew Garrett | [email protected]

2010-06-06 13:54:26

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sunday 06 June 2010, Arve Hj?nnev?g wrote:
> 2010/6/5 Rafael J. Wysocki <[email protected]>:
> > On Saturday 05 June 2010, Arve Hj?nnev?g wrote:
> >> 2010/6/5 Thomas Gleixner <[email protected]>:
> >> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
...
> >
> > Arve, we're still learning you have some more requirements we had no idea
>
> What new requirement are you talking about. Did you assume all our
> user-space ipc calls went though a single process?

No, but I didn't assume that your wakelock-holding processes depend on the
other processes in a way that might prevent them from acquiring or dropping
a wakelock.

...
> >> >> Trusted code that calls into untrusted code has to deal with the
> >> >> untrusted code not responding, but we only want to pop up a message
> >> >> that the application is not responding if it is misbehaving, not just
> >> >> because it was frozen though no fault of its own.
> >
> > When Android starts opportunistic suspend, all applications are frozen,
> > "trusted" as well as "untrusted", right? So, after they are all frozen, none
> > of them can do anything to prevent suspend from happening, right?
>
> Not if you mean when we write to /sys/power/state. Processes are not
> frozen until the last suspend blocker is released.

That doesn't matter. In the opportunistic mode you don't need to write into
/sys/power/state to start suspend, this is done by the kernel automatically as
soon as the last wakelock has been released (at least this is my assumption
about how this works). Now, at this point the processes that don't use
wakelocks can't really prevent themselves from being frozen and only the
wakelocks users can do that. So, if a wakelock user depends on a process
that doesn't use wakelocks in such a way that (because of that dependence) it
can't acquire its wakelock while processes are being frozen, things don't work
as they are supposed to.

> > Now, in my proposed approach the "untrusted" apps are frozen exactly at the
> > point Android would start opportunistic suspend and they wouldn't be able
> > to do anything about that anyway. So if one of your "trusted" apps depends
> > on the "untrusted" ones in a way that you describe, you alread have a bug
> > (the "trusted" app cannot prevent automatic suspend from happening even if it
> > wants, because it depends on an "untrusted" app that has just been frozen).
> >
>
> I don't think what you said here is correct. If a wakeup event happens
> all processed are unfrozen since the driver blocks suspend.

This only means that the theoretical failure you gave as an example doesn't
happen in practice. No problem, then. :-)

> The app that reads this event blocks suspend before reading it. If it was
> busy talking to a less trusted app when the event happened it still works
> since all apps are running at this point.

And how is this different from an approach with cgroup freezing? Apps that
use wakelock within the current framework would use "freeze locks" to prevent
the "untrusted" part of user space from being frozen or to thaw it. Where's
the problem, then?

Rafael

2010-06-06 14:32:06

by James Bottomley

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 2010-06-06 at 12:05 +0100, Alan Cox wrote:
> On Sun, 6 Jun 2010 12:46:01 +0200
> Florian Mickler <[email protected]> wrote:
>
> > On Sun, 6 Jun 2010 12:00:47 +0200
> > Vitaly Wool <[email protected]> wrote:
> >
> > > Even worse, the suspend wakelock will keep the
> > > whole kernel active, as opposed to powering off unused devices
> > > separately as it's done in runtime PM.
> >
> > That is not true. While the kernel is not suspended it does
> > runtime pm.
>
> On several of our platforms runtime PM already includes suspend so a
> suspend wakelock does interfere with existing power managemet at that
> level (not to mention the maintenance mess it causes).
>
> This is one of the reasons you want QoS information, it provides
> parameters by which the power management code can make a decision.
> Suspend blocksers simply don't have sufficient variety to manage the
> direction of power policy.
>
> If Android chooses to abuse the QoS information for crude suspend
> blocking then that is fine, it doesn't interfere with doing the job
> 'properly' on other systems or its use for realtime work on other boxes.

Right ... and I think we can make use of this as an incremental way
forwards. This QoS re-expression needs doing for the suspend from idle
+ cgroup approach, and it can be made to work with the current suspend
blockers patch.

I've already posted most of the necessary improvements to pm_qos, all of
which end up looking like the right thing to do independent of android.
There's really only one remaining thing, and that's adding statistics.

Once stats are added, I think I can transform the 8 android patches into
a set of 7 pm_qos transformations and one patch that adds the
opportunistic suspend infrastructure. The 7 pm_qos patches should be
reasonably uncontroversial, but what they would allow us to do is to
unblock about 75% of the driver divergences from Qualcomm and others.
The 1 opportunistic suspend one will be confined to one or two files, so
is easy to maintain ... we can then argue over who should maintain it in
the interim, us or Google.

>From this basis, we can then proceed to look at implementing the cgroups
+ suspend from idle approach, and we can do this regardless of whether
the opportunistic suspend patch is applied or not.

There are three reasons why the whole debate is going in circles

1. Lots of people are taking a holistic approach (i.e. must solve
everything) ... this means that previously unarticulated issues
keep cropping up that are unrelated to the current patch set ...
but which set off another cascade of emails.
2. There currently is no cgroups + suspend from idle approach
implemented anywhere. That means we have to argue theoreticals
rather than actuals (theoreticals are easy to shoot down with
other theoretical arguments ... leading to another email
cascade). If we implemented the thing, these arguments would
compare one factual basis to another.
3. We've lost sight of one of the original goals, which was to
bring the android tree close enough to the kernel so that the
android downstream driver and board producers don't have to
choose between the android kernel and vanilla kernel.

I think the proposal above gets us to within 75% of the way to 3, moves
us towards a factual basis for 2 and eliminates some of the grounds for
argument of 1 ... now can we please get on with it?

James

2010-06-06 14:43:30

by Matt Helsley

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sun, Jun 06, 2010 at 12:36:21PM +0200, Thomas Gleixner wrote:
> On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:

<snip>

> > events like input event go though a single thread in our system
> > process, while other events like network packets (which are also
> > wakeup events) goes directly to the app.

If you want to wake up cgroup-frozen tasks for these fds perhaps your
framework can fcnt(fd, F_SETOWN, <p[g]id>) to send SIGIO to a
userspace-suspend-blocker thread/process/process group. When IO comes in, the
suspend blocker is signalled which then unfreezes the cgroup of the frozen
untrusted task. SIGIO works on pipes, fifos, sockets, ttys, and ptys --
many of which are precisely the kinds of things that would connect [trusted
and untrusted] apps. Notably absent (last I checked): inotify fds, signalfd,
timerfd, eventfd, filesystem fds and likely more.

Incidentally, this is just to show that it's not impossible to implement
"wakeups" for cgroup-frozen tasks in userspace.

Cheers,
-Matt Helsley

2010-06-06 15:26:12

by Vitaly Wool

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/6 Matthew Garrett <[email protected]>:
> On Sun, Jun 06, 2010 at 12:00:47PM +0200, Vitaly Wool wrote:
>
>> Sure, but my point was, some non-trivial (still kind of natural for a
>> smartphone) activities with the device will prevent it from suspending
>> for quite some time. Even worse, the suspend wakelock will keep the
>> whole kernel active, as opposed to powering off unused devices
>> separately as it's done in runtime PM. Yep, I know about the "early
>> suspend" type of thing; yet it's excess, not mainlined and lacks
>> granularity.
>
> Holding a suspend blocker is entirely orthogonal to runtime pm. The
> "whole kernel" will not be "active" - it can continue to hit the same
> low power state in the idle loop, and any runtime pm implementation in
> the drivers will continue to be active.

Yeah, that might also be the case, But then again, what's the use of
suspend blockers in this situation?

~Vitaly

2010-06-06 15:30:34

by Matthew Garrett

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 06, 2010 at 05:26:09PM +0200, Vitaly Wool wrote:
> 2010/6/6 Matthew Garrett <[email protected]>:
> > Holding a suspend blocker is entirely orthogonal to runtime pm. The
> > "whole kernel" will not be "active" - it can continue to hit the same
> > low power state in the idle loop, and any runtime pm implementation in
> > the drivers will continue to be active.
>
> Yeah, that might also be the case, But then again, what's the use of
> suspend blockers in this situation?

The difference between idle-based suspend and opportunistic suspend is
that the former will continue to wake up for timers and will never be
entered if something is using CPU, whereas the latter will be entered
whenever no suspend blocks are held. The problem with opportunistic
suspend is that you might make the decision to suspend simultaneusly
with a wakeup event being received. Suspend blocks facilitate
synchronisation between the kernel and userspace to ensure that all such
events have been consumed and handld appropriately.

--
Matthew Garrett | [email protected]

2010-06-06 15:46:59

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010, James Bottomley wrote:
>
> 3. We've lost sight of one of the original goals, which was to
> bring the android tree close enough to the kernel so that the
> android downstream driver and board producers don't have to
> choose between the android kernel and vanilla kernel.

There are two ways to do that w/o creating a dependcy on anything.

1) merge the drivers w/o the suspend_blockers. It's not rocket science
to have a patch which brings them back for android.

2) merge the drivers with empty stub implementations for annotation.
android just has to patch in the real one.

While I'd prefer #1, I' not in the way of #2.

Both ways can get the drivers into the kernel and it could/should have
been done right from the beginning, but now we face a situation where
drivers are held hostage.

Then we can sit down more relaxed and fix the stuff in a way which
makes both sides happy. If we manage to replace them, we can deprecate
the stub implementation and remove it after a grace period. If we
rename them it's not an issue either. We can rename them right away to
a qos interface, but that does not really make a difference.

What we really want to avoid is implementing an user space contract in
a frenzy which binds us forever.

It's not the suspend_blockers which are the causing the nightmare,
it's solely the drivers itself especially when there are different
implementations in both trees. And frankly, the drivers in android are
not in a shape which makes them flood in within 2 weeks. That's
serious work to get them brushed up and polished. So that gives us
quite a period of time to solve the suspend problem.

Thanks,

tglx

2010-06-06 15:47:17

by Vitaly Wool

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/6 Matthew Garrett <[email protected]>:
> On Sun, Jun 06, 2010 at 05:26:09PM +0200, Vitaly Wool wrote:
>> 2010/6/6 Matthew Garrett <[email protected]>:
>> > Holding a suspend blocker is entirely orthogonal to runtime pm. The
>> > "whole kernel" will not be "active" - it can continue to hit the same
>> > low power state in the idle loop, and any runtime pm implementation in
>> > the drivers will continue to be active.
>>
>> Yeah, that might also be the case, But then again, what's the use of
>> suspend blockers in this situation?
>
> The difference between idle-based suspend and opportunistic suspend is
> that the former will continue to wake up for timers and will never be
> entered if something is using CPU, whereas the latter will be entered
> whenever no suspend blocks are held. The problem with opportunistic
> suspend is that you might make the decision to suspend simultaneusly
> with a wakeup event being received. Suspend blocks facilitate
> synchronisation between the kernel and userspace to ensure that all such
> events have been consumed and handld appropriately.

Right, and then you start taking suspend blockers in kernel here and
there which eventually interferes with runtime PM.
So I don't buy the "orthogonality" point. Generally speaking, it's not true.

~Vitaly

2010-06-06 16:43:58

by Matthew Garrett

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 06, 2010 at 05:47:10PM +0200, Vitaly Wool wrote:
> 2010/6/6 Matthew Garrett <[email protected]>:
> > The difference between idle-based suspend and opportunistic suspend is
> > that the former will continue to wake up for timers and will never be
> > entered if something is using CPU, whereas the latter will be entered
> > whenever no suspend blocks are held. The problem with opportunistic
> > suspend is that you might make the decision to suspend simultaneusly
> > with a wakeup event being received. Suspend blocks facilitate
> > synchronisation between the kernel and userspace to ensure that all such
> > events have been consumed and handld appropriately.
>
> Right, and then you start taking suspend blockers in kernel here and
> there which eventually interferes with runtime PM.

Suspend blocks prevent system suspend, not any per-device suspend.

--
Matthew Garrett | [email protected]

2010-06-06 17:08:46

by James Bottomley

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 2010-06-06 at 17:46 +0200, Thomas Gleixner wrote:
> On Sun, 6 Jun 2010, James Bottomley wrote:
> >
> > 3. We've lost sight of one of the original goals, which was to
> > bring the android tree close enough to the kernel so that the
> > android downstream driver and board producers don't have to
> > choose between the android kernel and vanilla kernel.
>
> There are two ways to do that w/o creating a dependcy on anything.
>
> 1) merge the drivers w/o the suspend_blockers. It's not rocket science
> to have a patch which brings them back for android.

Well, we sort of tried this when Greg pulled some of them into the
staging tree. The problem is that without the annotations, the drivers
are still different, and patches won't apply, so, unsurprisingly, they
didn't get improved or even maintained.

> 2) merge the drivers with empty stub implementations for annotation.
> android just has to patch in the real one.

That's also possible. This time, we would have a cosmetically closer
tree ... however, what's in the kernel wouldn't be compilable for
android ... which is where all the downstream wants to test, so they'd
still be building for the android tree ... we just might have an easier
time of it picking up their fixes.

> While I'd prefer #1, I' not in the way of #2.

I think 1 is unviable ... I'm not opposed to 2 but I'd like to try to
get the kernel really closer to android before we go for the cosmetic
only option.

> Both ways can get the drivers into the kernel and it could/should have
> been done right from the beginning, but now we face a situation where
> drivers are held hostage.
>
> Then we can sit down more relaxed and fix the stuff in a way which
> makes both sides happy. If we manage to replace them, we can deprecate
> the stub implementation and remove it after a grace period. If we
> rename them it's not an issue either. We can rename them right away to
> a qos interface, but that does not really make a difference.
>
> What we really want to avoid is implementing an user space contract in
> a frenzy which binds us forever.

Well, that's why the QoS proposal ... it already has a userspace API ...
we'd just be extending it for statistics, which looks like a wothwhile
goal independent of android anyway.

> It's not the suspend_blockers which are the causing the nightmare,
> it's solely the drivers itself especially when there are different
> implementations in both trees. And frankly, the drivers in android are
> not in a shape which makes them flood in within 2 weeks. That's
> serious work to get them brushed up and polished. So that gives us
> quite a period of time to solve the suspend problem.

Right, so the sooner we make it easier for the drivers to use the kernel
as their main repository, the better.

James

2010-06-06 17:21:52

by Vitaly Wool

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/6 Matthew Garrett <[email protected]>:

> Suspend blocks prevent system suspend, not any per-device suspend.

Can you suspend a device which is holding a wake lock?

~Vitaly

2010-06-06 17:32:17

by Matthew Garrett

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 06, 2010 at 07:21:49PM +0200, Vitaly Wool wrote:
> 2010/6/6 Matthew Garrett <[email protected]>:
>
> > Suspend blocks prevent system suspend, not any per-device suspend.
>
> Can you suspend a device which is holding a wake lock?

Yes. Suspend blocks are orthogonal to runtime PM.

--
Matthew Garrett | [email protected]

2010-06-06 18:05:17

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

James,

On Sun, 6 Jun 2010, James Bottomley wrote:

> On Sun, 2010-06-06 at 17:46 +0200, Thomas Gleixner wrote:
> > On Sun, 6 Jun 2010, James Bottomley wrote:
> > >
> > > 3. We've lost sight of one of the original goals, which was to
> > > bring the android tree close enough to the kernel so that the
> > > android downstream driver and board producers don't have to
> > > choose between the android kernel and vanilla kernel.
> >
> > There are two ways to do that w/o creating a dependcy on anything.
> >
> > 1) merge the drivers w/o the suspend_blockers. It's not rocket science
> > to have a patch which brings them back for android.
>
> Well, we sort of tried this when Greg pulled some of them into the
> staging tree. The problem is that without the annotations, the drivers
> are still different, and patches won't apply, so, unsurprisingly, they
> didn't get improved or even maintained.
>
> > 2) merge the drivers with empty stub implementations for annotation.
> > android just has to patch in the real one.
>
> That's also possible. This time, we would have a cosmetically closer
> tree ... however, what's in the kernel wouldn't be compilable for
> android ... which is where all the downstream wants to test, so they'd
> still be building for the android tree ... we just might have an easier
> time of it picking up their fixes.

The downstream users will be bound to the android tree anyway until
the full set of drivers for a given platform is completely merged. So
optimistically that would be 2.6.36, which gives us a couple of
months to sort out the whole thing.

Once a driver is merged mainline and the android tree switched over to
use the mainline version, fixes can be sent both ways and that's not a
real problem.

> > While I'd prefer #1, I' not in the way of #2.
>
> I think 1 is unviable ... I'm not opposed to 2 but I'd like to try to
> get the kernel really closer to android before we go for the cosmetic
> only option.
>
> > Both ways can get the drivers into the kernel and it could/should have
> > been done right from the beginning, but now we face a situation where
> > drivers are held hostage.
> >
> > Then we can sit down more relaxed and fix the stuff in a way which
> > makes both sides happy. If we manage to replace them, we can deprecate
> > the stub implementation and remove it after a grace period. If we
> > rename them it's not an issue either. We can rename them right away to
> > a qos interface, but that does not really make a difference.
> >
> > What we really want to avoid is implementing an user space contract in
> > a frenzy which binds us forever.
>
> Well, that's why the QoS proposal ... it already has a userspace API ...
> we'd just be extending it for statistics, which looks like a wothwhile
> goal independent of android anyway.

Right, but there is no dependency for the driver merge on that.

> > It's not the suspend_blockers which are the causing the nightmare,
> > it's solely the drivers itself especially when there are different
> > implementations in both trees. And frankly, the drivers in android are
> > not in a shape which makes them flood in within 2 weeks. That's
> > serious work to get them brushed up and polished. So that gives us
> > quite a period of time to solve the suspend problem.
>
> Right, so the sooner we make it easier for the drivers to use the kernel
> as their main repository, the better.

Yep, the fastest way is to provide two stub inlines in pm.h and let
the driver flood come in.

I think all of us involved in that can do with a break, where we sit
back, calm down and rethink w/o time pressure.

Thanks,

tglx

2010-06-06 18:44:25

by Brian Swetland

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 6, 2010 at 11:04 AM, Thomas Gleixner <[email protected]> wrote:
>>
>> Right, so the sooner we make it easier for the drivers to use the kernel
>> as their main repository, the better.
>
> Yep, the fastest way is to provide two stub inlines in pm.h and let
> the driver flood come in.

As mentioned previously, we didn't know this was an option (stubs
without an implementation behind them). If it is, and thus we can
simplify the driver merging process short-term while sorting out a
long-term implementation or replacement for suspend blockers, then I
think we're making real progress.

There is no reason that drivers that support suspend blockers cannot
(or should not) support regular suspend/resume or runtime
suspend/resume, so that shouldn't be a worry. The various operating
modes complement, not compete with each other (we want the lowest
possible power consumption in all states, be that fully active, fully
suspended, or anything in between -- I think everyone involved agrees
with this).

> I think all of us involved in that can do with a break, where we sit
> back, calm down and rethink w/o time pressure.

Total agreement.

Brian

2010-06-06 19:00:48

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sunday 06 June 2010, Matthew Garrett wrote:
> On Sun, Jun 06, 2010 at 07:21:49PM +0200, Vitaly Wool wrote:
> > 2010/6/6 Matthew Garrett <[email protected]>:
> >
> > > Suspend blocks prevent system suspend, not any per-device suspend.
> >
> > Can you suspend a device which is holding a wake lock?
>
> Yes. Suspend blocks are orthogonal to runtime PM.

Yes, they are.

The idea that they would somehow interfere with runtime PM has always been pure
nonsense. They wouldn't and there's no way in which they could.

Whether or not they are really necessary to address the Android requirements is
a different question, though, and I don't really think so right now. At least
not in the form submitted by Arve.

Also, there's some overlap between suspend blockers and the PM QoS framework,
so it seems to be a good idea to extend PM QoS (which already is in the kernel)
to express some requirements that suspend blockers are supposed to express.

Rafael

2010-06-06 19:05:53

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 06, 2010 at 12:08:34PM -0500, James Bottomley wrote:
> Well, we sort of tried this when Greg pulled some of them into the
> staging tree. The problem is that without the annotations, the drivers
> are still different, and patches won't apply, so, unsurprisingly, they
> didn't get improved or even maintained.

Err, the biggest pile in staging was meta drivers like the binder or
some oom killer magic which are flat out braindead and wrong and have
no chance going into mainline anyway. That's something different from
real hardware drivers, although a few of those made it into staging as
well if I remember correctly.

2010-06-06 19:15:16

by Brian Swetland

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 6, 2010 at 12:05 PM, Christoph Hellwig <[email protected]> wrote:
> On Sun, Jun 06, 2010 at 12:08:34PM -0500, James Bottomley wrote:
>> Well, we sort of tried this when Greg pulled some of them into the
>> staging tree.  The problem is that without the annotations, the drivers
>> are still different, and patches won't apply, so, unsurprisingly, they
>> didn't get improved or even maintained.
>
> Err, the biggest pile in staging was meta drivers like the binder or
> some oom killer magic which are flat out braindead and wrong and have
> no chance going into mainline anyway.  That's something different from
> real hardware drivers, although a few of those made it into staging as
> well if I remember correctly.

I was shocked when Greg pulled the binder driver and some of the other
"generic" android drivers into staging, because it was always my
assumption that nobody upstream would want them. We did get some
bugfixes for the binder driver (thanks!) but the general reaction was
pretty much the same as yours here. I then was relatively unsurprised
when it was dropped (we find it useful, upstream finds it useless, not
much else to say).

The various SoC peripheral drivers are, I suspect, much less
contentious (modulo suspend blocker usage and any necessary kernel
style cleanup).

Brian

2010-06-06 19:24:19

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 06, 2010 at 12:15:10PM -0700, Brian Swetland wrote:
> I was shocked when Greg pulled the binder driver and some of the other
> "generic" android drivers into staging, because it was always my
> assumption that nobody upstream would want them. We did get some
> bugfixes for the binder driver (thanks!) but the general reaction was
> pretty much the same as yours here. I then was relatively unsurprised
> when it was dropped (we find it useful, upstream finds it useless, not
> much else to say).
>
> The various SoC peripheral drivers are, I suspect, much less
> contentious (modulo suspend blocker usage and any necessary kernel
> style cleanup).

Yes. That's what makes me wonder about some parts of the discussion
here. Getting the drivers for one or more of the android plattforms
in is not a problem. I'd say it could have easily been done with the
manweeks spent arguing in this and related threads.

The much bigger issues is to get android userspace running on a more
or less vanilla kernel, that is withoutthe binder, without the rather
interesting group ID based security hack^H^H^H^Hmodel, without the
suspend blocker userspace API and so on, and so on.

So for people who really care about running a mainline kernel on their
android device doing that part first on a generic ARM board in qemu
might be much better first step work. On the other hand I've heard
that various hardware vendors or parties closed to them are rather
annoyed by their drivers beeing stuck in the android tree - but that
can be easily solved by getting removing the suspend blockers (at least
temporarily), cleaning up a few bits here and there and getting them in.

It's not rocket science to get support for ARM SOC number 1654 into the
mainline kernel.

2010-06-06 19:28:41

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010, Brian Swetland wrote:

> On Sun, Jun 6, 2010 at 11:04 AM, Thomas Gleixner <[email protected]> wrote:
> >>
> >> Right, so the sooner we make it easier for the drivers to use the kernel
> >> as their main repository, the better.
> >
> > Yep, the fastest way is to provide two stub inlines in pm.h and let
> > the driver flood come in.
>
> As mentioned previously, we didn't know this was an option (stubs
> without an implementation behind them). If it is, and thus we can

That's what Greg did when he pulled stuff into staging, but there is
no reason not to do it outside of staging as well. We can simply put
the stub functions into Documentation/feature-removal.txt to ensure
that we don't forget about them :)

> simplify the driver merging process short-term while sorting out a
> long-term implementation or replacement for suspend blockers, then I
> think we're making real progress.

Yes, that way we do not lose the annotations. Replacing them, ripping
them out or whatever we agree on, is a nobrainer.

But it gets the drivers into the tree, so they are usable outside of
android as well and the delta between android and mainline shrinks
significantly.

Thanks,

tglx

2010-06-06 19:58:17

by Brian Swetland

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 6, 2010 at 12:24 PM, Christoph Hellwig <[email protected]> wrote:
>
> Yes.  That's what makes me wonder about some parts of the discussion
> here.  Getting the drivers for one or more of the android plattforms
> in is not a problem.  I'd say it could have easily been done with the
> manweeks spent arguing in this and related threads.
>
> The much bigger issues is to get android userspace running on a more
> or less vanilla kernel, that is withoutthe  binder, without the rather
> interesting group ID based security hack^H^H^H^Hmodel, without the
> suspend blocker userspace API and so on, and so on.

Somebody will have to broker a deal with the frameworks/apps folks to
get rid of the binder. They like it a lot. Of course if somebody
built a drop-in replacement for the userspace side that didn't require
a kernel driver, had the same performance characteristics, solved the
same problems, etc, they could probably make an argument for it (or
just provide it as a drop-in replacement for people who want a more
"pure" linux underneath Android, even if we didn't pick it up).

The group ID stuff works incredibly well for gating device access --
we ensure that devices that need access from various processes end up
with perms like 0660 root audio (say for a raw audio interface), and
then we assure that processes which have the "may use audio hardware"
permission are executed with audio as an additional group. We ended
up using the same model to control socket, raw socket, and bt socket
access because at the time we could not find a reasonable way to grant
or exclude such permissions on a process by process basis.
Maintaining about 20-30 lines of diffs to make that work was not a bad
tradeoff (and we don't expect those patches to go upstream). If
there's a way to accomplish this without patching the kernel, we're
all ears.

While we do maintain some assorted patches to core code (like the
permissions thing), we strongly prefer to keep our work localized to
drivers (generic to android or specific to hardware), and try to
migrate to common kernel features when possible, submit patches (like
work Mike Chan is doing around cpufreq and power measurement), etc.
Patches to core code cause more headaches when we rebase up to new
kernel versions, after all.

> So for people who really care about running a mainline kernel on their
> android device doing that part first on a generic ARM board in qemu
> might be much better first step work.

> On the other hand I've heard
> that various hardware vendors or parties closed to them are rather
> annoyed by their drivers beeing stuck in the android tree - but that
> can be easily solved by getting removing the suspend blockers (at least
> temporarily), cleaning up a few bits here and there and getting them in.

This continues to baffle me. If we (Google) are such a headache, why
not just route around us. The drivers we've written are GPLv2, the
source is out there for anyone who wants it, etc. The drivers other
people have written we have no control over at all. From my point of
view it'd be an annoyance if somebody took the code we wrote, modified
it heavily, and pushed it upstream, but fundamentally I can't stop
that from happening other than by pushing it upstream myself, first.

Blaming me because you want me to send your drivers upstream and I'm
not doing it or not doing it fast enough or not doing it the way you
like does not seem very productive. I'm not a gatekeeper for patch
submission to the mainline kernel.

Brian

2010-06-06 22:27:43

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

Brian,

On Sun, 6 Jun 2010, Brian Swetland wrote:

> On Sun, Jun 6, 2010 at 12:24 PM, Christoph Hellwig <[email protected]> wrote:
> >
> > Yes.  That's what makes me wonder about some parts of the discussion
> > here.  Getting the drivers for one or more of the android plattforms
> > in is not a problem.  I'd say it could have easily been done with the
> > manweeks spent arguing in this and related threads.
> >
> > The much bigger issues is to get android userspace running on a more
> > or less vanilla kernel, that is withoutthe  binder, without the rather
> > interesting group ID based security hack^H^H^H^Hmodel, without the
> > suspend blocker userspace API and so on, and so on.
>
> Somebody will have to broker a deal with the frameworks/apps folks to
> get rid of the binder. They like it a lot. Of course if somebody
> built a drop-in replacement for the userspace side that didn't require
> a kernel driver, had the same performance characteristics, solved the
> same problems, etc, they could probably make an argument for it (or
> just provide it as a drop-in replacement for people who want a more
> "pure" linux underneath Android, even if we didn't pick it up).
>
> The group ID stuff works incredibly well for gating device access --
> we ensure that devices that need access from various processes end up
> with perms like 0660 root audio (say for a raw audio interface), and
> then we assure that processes which have the "may use audio hardware"
> permission are executed with audio as an additional group. We ended
> up using the same model to control socket, raw socket, and bt socket
> access because at the time we could not find a reasonable way to grant
> or exclude such permissions on a process by process basis.
> Maintaining about 20-30 lines of diffs to make that work was not a bad
> tradeoff (and we don't expect those patches to go upstream). If
> there's a way to accomplish this without patching the kernel, we're
> all ears.
>
> While we do maintain some assorted patches to core code (like the
> permissions thing), we strongly prefer to keep our work localized to
> drivers (generic to android or specific to hardware), and try to
> migrate to common kernel features when possible, submit patches (like
> work Mike Chan is doing around cpufreq and power measurement), etc.
> Patches to core code cause more headaches when we rebase up to new
> kernel versions, after all.

thanks for the honest answer and thanks to Christoph for the
reminder!

That takes a lot of the bullshit arguments about downstream users
being hurt out of the discussion. The above problems are way more
complex to resolve than the suspend blocker details.

That's another prove why we can let the drivers flow in (in the worst
case w/o the suspend blocker stubs) and have no pressure to resolve
the suspend blocker problem yesterday.

That said, after thinking more about it, I'm advocating the stubs
solution with a clear removal / decision date constraint
(e.g. 2.6.37), as it forces all involved parties to stay tuned and not
to forget about it. I'm curious about the outcome :)

Thanks,

tglx

2010-06-07 00:01:59

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010, Matthew Garrett wrote:

> The difference between idle-based suspend and opportunistic suspend is
> that the former will continue to wake up for timers and will never be
> entered if something is using CPU, whereas the latter will be entered
> whenever no suspend blocks are held. The problem with opportunistic
> suspend is that you might make the decision to suspend simultaneusly
> with a wakeup event being received. Suspend blocks facilitate
> synchronisation between the kernel and userspace to ensure that all such
> events have been consumed and handld appropriately.

Remember that suspend takes place in several phases, the first of which
is to freeze tasks. The phases can be controlled individually by the
process carrying out a suspend, and there's nothing to prevent you from
stopping after the freezer phase. Devices won't get powered down, but
Android uses aggressive runtime power management for its devices
anyway.

If you do this then the synchronization can be carried out entirely
from userspace, with no need for kernel modifications such as suspend
blockers. And since Android can reach essentially the same low-power
state from idle as from suspend, it appears that they really don't need
any kernel changes at all.

Alan Stern

2010-06-07 01:09:00

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integrationy

Alan,

On Sun, 6 Jun 2010, Alan Stern wrote:

> On Sun, 6 Jun 2010, Matthew Garrett wrote:
>
> > The difference between idle-based suspend and opportunistic suspend is
> > that the former will continue to wake up for timers and will never be
> > entered if something is using CPU, whereas the latter will be entered
> > whenever no suspend blocks are held. The problem with opportunistic
> > suspend is that you might make the decision to suspend simultaneusly
> > with a wakeup event being received. Suspend blocks facilitate
> > synchronisation between the kernel and userspace to ensure that all such
> > events have been consumed and handld appropriately.
>
> Remember that suspend takes place in several phases, the first of which
> is to freeze tasks. The phases can be controlled individually by the
> process carrying out a suspend, and there's nothing to prevent you from
> stopping after the freezer phase. Devices won't get powered down, but
> Android uses aggressive runtime power management for its devices
> anyway.
>
> If you do this then the synchronization can be carried out entirely
> from userspace, with no need for kernel modifications such as suspend
> blockers. And since Android can reach essentially the same low-power
> state from idle as from suspend, it appears that they really don't need
> any kernel changes at all.

Well there are some things to solve:

1) the confinement of untrusted apps

2) the confinement of trusted apps firing periodic timers.

Aside of that they need to shut off undesired interrupt sources,
but that's not a real problem to solve and probably possible
today already.

#1 can be solved elegantly by cgroups. We know how to freeze the
cgroup timers when the need arises, but that's not a real problem
as all periodic timers are firing only once when the target app
does not handle them.

Though the cgroup based approach of freezing timers might be
interesting for containers c/r as well and it might be necessary to
emulate the suspend semantics of freezing CLOCK_MONOTONIC, but
that's simple to do and basically no overhead.

#2 is a tad harder, as it requires to fix the trusted apps not to fire
timers when there is nothing to do.

Though you can solve it with cgroups as well. The unfreeze problem
for real wakeups can be solved as mhelsley pointed out somewhere
else in this thread.

But that depends on user space changes ....

Though as we learned today that suspend blockers are the least of the
problems which android is facing vs. mainline and we have a plan to
get the drivers in we can relax a bit and think more about it.

I'm not saying that we should ignore the shortcomings of todays code,
but in the face of hardware which perfectly goes into the same power
state from idle as it does from suspend we can IMNSHO safely ignore
the x86/ACPI/BIOS crap and the user space wreckage (see above) and
just focus on a sane design based on current/future hardware.

That's nothing new, we do not go and make NOHZ/HIGHRES work on crappy
hardware either, even if there have been patches around to do so. We
do not make broken TSCs work, even if there are brute force ways to do
so. ....

It's more sane to say "Sorry, it does not work on your system" than
trying to make it work under all circumstances for a questionable
benefit and paying the price for it in terms of maintainability and/or
complexity.

Thanks,

tglx

2010-06-07 08:00:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Mon, Jun 07, 2010 at 12:26:55AM +0200, Thomas Gleixner wrote:
> That takes a lot of the bullshit arguments about downstream users
> being hurt out of the discussion. The above problems are way more
> complex to resolve than the suspend blocker details.
>
> That's another prove why we can let the drivers flow in (in the worst
> case w/o the suspend blocker stubs) and have no pressure to resolve
> the suspend blocker problem yesterday.
>
> That said, after thinking more about it, I'm advocating the stubs
> solution with a clear removal / decision date constraint
> (e.g. 2.6.37), as it forces all involved parties to stay tuned and not
> to forget about it. I'm curious about the outcome :)

As long as we have that clear removal schedule I'm fine with in-kernel
suspend blocker stubs.

2010-06-07 08:04:28

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 06, 2010 at 12:58:10PM -0700, Brian Swetland wrote:
> Somebody will have to broker a deal with the frameworks/apps folks to
> get rid of the binder. They like it a lot. Of course if somebody
> built a drop-in replacement for the userspace side that didn't require
> a kernel driver, had the same performance characteristics, solved the
> same problems, etc, they could probably make an argument for it (or
> just provide it as a drop-in replacement for people who want a more
> "pure" linux underneath Android, even if we didn't pick it up).

This wasn't really directed at you, but rather about people talking
about running a mainline kernel on Android in this thread. As I said
this is a lot more work then sorting out the drivers - with or without
suspend blockers.

> The group ID stuff works incredibly well for gating device access --
> we ensure that devices that need access from various processes end up
> with perms like 0660 root audio (say for a raw audio interface), and
> then we assure that processes which have the "may use audio hardware"
> permission are executed with audio as an additional group. We ended
> up using the same model to control socket, raw socket, and bt socket
> access because at the time we could not find a reasonable way to grant
> or exclude such permissions on a process by process basis.
> Maintaining about 20-30 lines of diffs to make that work was not a bad
> tradeoff (and we don't expect those patches to go upstream). If
> there's a way to accomplish this without patching the kernel, we're
> all ears.

I'd have to take a look again on how this is implemented in details.
If it's just overriding the capabilities it's really hard to do in
the current model as the capabilities aren't fine grained enough
currently, even with the existing per-file and per-process capabilities.
If it's mostly overriding regular unix file permissions it's easily
doable with ACLs, or in fact just with group ownership at the filesystem
level, without kernel hacks.

2010-06-07 08:16:28

by Brian Swetland

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Mon, Jun 7, 2010 at 1:03 AM, Christoph Hellwig <[email protected]> wrote:
> On Sun, Jun 06, 2010 at 12:58:10PM -0700, Brian Swetland wrote:
>> The group ID stuff works incredibly well for gating device access --
>> we ensure that devices that need access from various processes end up
>> with perms like 0660 root audio (say for a raw audio interface), and
>> then we assure that processes which have the "may use audio hardware"
>> permission are executed with audio as an additional group.  We ended
>> up using the same model to control socket, raw socket, and bt socket
>> access because at the time we could not find a reasonable way to grant
>> or exclude such permissions on a process by process basis.
>> Maintaining about 20-30 lines of diffs to make that work was not a bad
>> tradeoff (and we don't expect those patches to go upstream).  If
>> there's a way to accomplish this without patching the kernel, we're
>> all ears.
>
> I'd have to take a look again on how this is implemented in details.
> If it's just overriding the capabilities it's really hard to do in
> the current model as the capabilities aren't fine grained enough
> currently, even with the existing per-file and per-process capabilities.
> If it's mostly overriding regular unix file permissions it's easily
> doable with ACLs, or in fact just with group ownership at the filesystem
> level, without kernel hacks.

For device nodes, we just use group ownership and it works fine with
no kernel modifications. For the "can create socket", "can create bt
socket", and "can create raw socket" permissions we ended up throwing
together a patch tying those operations to being in the appropriate
group. Obviously a hack, but it was the most straightforward solution
we could find at the time.

Brian

2010-06-07 10:25:57

by Felipe Contreras

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/6 Matthew Garrett <[email protected]>:
> On Sun, Jun 06, 2010 at 07:21:49PM +0200, Vitaly Wool wrote:
>> 2010/6/6 Matthew Garrett <[email protected]>:
>>
>> > Suspend blocks prevent system suspend, not any per-device suspend.
>>
>> Can you suspend a device which is holding a wake lock?
>
> Yes. Suspend blocks are orthogonal to runtime PM.

In that sense yes, but as it has been stated before; if Android
user-space concentrates on getting suspend blockers right, then the
timers in user-space will not be aligned correctly, and runtime PM
wouldn't work that great. Moreover, opportunistic suspend takes the
device out of idle. So, as runtime PM gets better, there's a point
where opportunistic suspend makes the situation worst.

So they are _mostly_ orthogonal, but not completely, at least for the
analysis of suspend blockers' usefulness.

--
Felipe Contreras

2010-06-07 12:16:45

by Florian Mickler

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010 04:14:09 -0700 (PDT)
[email protected] wrote:

> On Sun, 6 Jun 2010, Florian Mickler wrote:
>
> > On Sun, 6 Jun 2010 12:19:08 +0200
> > Vitaly Wool <[email protected]> wrote:
> >
> >> 2010/6/6 <[email protected]>:
> >>
> >>> as an example (taken from this thread).
> >>>
> >>> system A needs to wake up to get a battery reading, store it and go back to
> >>> sleep, It does so every 10 seconds. But when it does so it only runs the one
> >>> process and then goes back to sleep.
> >>>
> >>> system B has the same need, but wakes up every 10 minutes. but when it does
> >>> so it fully wakes up and this allows the mail app to power up the radio,
> >>> connect to the Internet and start checking for new mail before oppurtunistic
> >>> sleep shuts things down (causing the mail check to fail)
> >>>
> >>> System A will last considerably longer on a battery than System B.
> >>
> >> Exactly, thanks for pointing out the specific example :)
> >>
> >> ~Vitaly
> >
> > This does not affect suspend_blockers nor does suspend_blockers
> > interfere with that.
> >
> > Suspend_blockers allow the system to suspend ("mem">/sys/power/state
> > suspend), when the userspace decides that the device is not in use.
> >
> > So implementing suspend_blockers support does not impact any
> > optimizations done to either system A nor system B.
>
> Actually, it does.
>
> system A is what's being proposed by kernel developers, where the
> untrusted stuff is in a different cgroup and what puts the system to sleep
> is 'normal' power management. It doesn't sleep as long, but when it wakes
> up the untrusted stuff is still frozen, so it doesn't stay awake long, or
> do very much.
>
> System B is suspend blockers where you are either awake or asleep, and
> when you wake up you wake up fully, but oppertunistic sleep can interrupt
> untrusted processes at any time. The system sleeps longer (as fewer things
> can wake it), but when it wakes up it's fully awake.
>
> David Lang

You say, that coming back from suspend takes the system to full power
(and everything runs) before it begins the descend into
runtime-low-power?
But are you referring to the fact that coming back
from suspend starts in the zero-idle-state (i.e. "consumes extra
power") or that all processes run when it is not suspended?

Because the latter would of course (theretically) profit from the
framework-controlled-cgroup-freeze/thaw (with and without
opportunistic suspend) while the former should be a problem that
both opportunistic suspend as well as suspend-from-idle have. Or not?

So, here is the question I'm asking myself: If System A were to be
complemented by suspend_blockers, wouldn't it still be better?

With System A you could try to do a really sophisticated
power-management scheme and so on... but as soon as you allow 3rd-Party
Apps, how do you manage their cross-dependencies? I.e. you can not
automatically detect when App1 needs App2 to function.
You need to allow all 3rd-Party apps to run as a group.

So you can perhaps partition your software stack into "untrusted
applications" and different groups of software with audited
dependencies.

If one group interacts with another group (as will be the case at least
with the "untrusted applications" group) you have to have them both
running at the same time.

This really gets pretty complex. Do you really think something like
this is better than a simple suspend? (I.e. suspend blockers or
having just one group)

Suppose you implement suspend blockers with a cgroup freeze... how do
you implement the freeze/thaw control?

Cheers,
Flo

p.s.: do you see an possibility for any kind of "priority inheritance"
in the cgroup scheme? I don't.

2010-06-07 13:01:54

by Florian Mickler

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010 19:21:49 +0200
Vitaly Wool <[email protected]> wrote:

> 2010/6/6 Matthew Garrett <[email protected]>:
>
> > Suspend blocks prevent system suspend, not any per-device suspend.
>
> Can you suspend a device which is holding a wake lock?
>
> ~Vitaly

If you look at the suspend blocker patchset, you'll see that the only
patches adding calls to "suspend_is_blocked()" are

PM: Opportunistic suspend support

and

PM: Abort task freezing if there is an active suspend blocker .

Both are in generic suspend code and don't affect runtime pm.

Cheers,
Flo

2010-06-07 13:20:42

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 2010-06-06 at 12:58 -0700, Brian Swetland wrote:
> Somebody will have to broker a deal with the frameworks/apps folks to
> get rid of the binder. They like it a lot. Of course if somebody
> built a drop-in replacement for the userspace side that didn't require
> a kernel driver, had the same performance characteristics, solved the
> same problems, etc, they could probably make an argument for it (or
> just provide it as a drop-in replacement for people who want a more
> "pure" linux underneath Android, even if we didn't pick it up).

So what's up with this Binder stuff, from what I can see its just
yet-another-CORBA. Why does it need a kernel part at all, can't you
simply run with a user-space ORB instead?

I really don't get why people keep re-inventing CORBA, there's some
really nice (free) ORBs out there, like:

http://www.cs.wustl.edu/~schmidt/TAO.html


2010-06-07 14:42:26

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integrationy

On Mon, 7 Jun 2010, Thomas Gleixner wrote:

> Alan,

Thomas:

> On Sun, 6 Jun 2010, Alan Stern wrote:

> > Remember that suspend takes place in several phases, the first of which
> > is to freeze tasks. The phases can be controlled individually by the
> > process carrying out a suspend, and there's nothing to prevent you from
> > stopping after the freezer phase. Devices won't get powered down, but
> > Android uses aggressive runtime power management for its devices
> > anyway.
> >
> > If you do this then the synchronization can be carried out entirely
> > from userspace, with no need for kernel modifications such as suspend
> > blockers. And since Android can reach essentially the same low-power
> > state from idle as from suspend, it appears that they really don't need
> > any kernel changes at all.
>
> Well there are some things to solve:
>
> 1) the confinement of untrusted apps
>
> 2) the confinement of trusted apps firing periodic timers.

No, these things don't need to be solved.

The difference between our approaches is that you're trying to find a
solution that can be used more generally, not just on Android, whereas
I'm simply trying to find a way to help Arve and Brian accomplish what
they want as easily as possible.

(Even though some kernel developers may consider their design to be
ill-conceived. So long as the objectionable features are confined to
userspace nobody will mind. And if a few small kernel patches are
needed, they will most likely be accepted if they aren't intrusive.)

This means sticking to the "big-gun" approach of system-wide suspend or
its equivalent, because Android's userspace is so unstructured and
has so many widespread (perhaps even conflicting!) requirements. And
of course the necessary changes to userspace should be minimal.

Thus, in my proposed scheme there is no need to confine untrusted apps,
just as there was no need to do so when using wakelocks in their
original form. Likewise, there is no need to worry about trusted apps
firing periodic timers: Once processes are frozen, user timers really
don't matter much.

> Aside of that they need to shut off undesired interrupt sources,
> but that's not a real problem to solve and probably possible
> today already.

Yes. They may also need to shut off undesired kernel timers. This
should be doable as well, and it would benefit all platforms.

> #1 can be solved elegantly by cgroups. We know how to freeze the
> cgroup timers when the need arises, but that's not a real problem
> as all periodic timers are firing only once when the target app
> does not handle them.

That's why I said they don't matter much.

> Though the cgroup based approach of freezing timers might be
> interesting for containers c/r as well and it might be necessary to
> emulate the suspend semantics of freezing CLOCK_MONOTONIC, but
> that's simple to do and basically no overhead.

Cgroups are not needed. The power-manager process can simply freeze
all tasks other than itself.

> #2 is a tad harder, as it requires to fix the trusted apps not to fire
> timers when there is nothing to do.

No; all you have to do is handle the trusted apps as though they were
untrusted -- just as in the original wakelock approach.

> Though you can solve it with cgroups as well. The unfreeze problem
> for real wakeups can be solved as mhelsley pointed out somewhere
> else in this thread.
>
> But that depends on user space changes ....

If you handle all the apps uniformly, very few userspace changes are
needed.

> Though as we learned today that suspend blockers are the least of the
> problems which android is facing vs. mainline and we have a plan to
> get the drivers in we can relax a bit and think more about it.

The only reason for inventing suspend blockers in the first place was
to make system suspend reliable in the face of races with wakeup
sources. If system suspend doesn't run to completion, but stops after
freezing all tasks other than the power manager, then these races can
be handled entirely within userspace. Hence there is no need to have
suspend blockers at all.

Assuming Arve doesn't find any fatal flaws in this approach, the issue
will become moot. We can relax and _avoid_ thinking about it! :-)

> I'm not saying that we should ignore the shortcomings of todays code,
> but in the face of hardware which perfectly goes into the same power
> state from idle as it does from suspend we can IMNSHO safely ignore
> the x86/ACPI/BIOS crap and the user space wreckage (see above) and
> just focus on a sane design based on current/future hardware.

You're preaching to the choir. If the Google people want to use
suspend blockers on ACPI-based systems ... well, we can resume the
arguments then. They won't be nearly as pressing.

> That's nothing new, we do not go and make NOHZ/HIGHRES work on crappy
> hardware either, even if there have been patches around to do so. We
> do not make broken TSCs work, even if there are brute force ways to do
> so. ....
>
> It's more sane to say "Sorry, it does not work on your system" than
> trying to make it work under all circumstances for a questionable
> benefit and paying the price for it in terms of maintainability and/or
> complexity.

Agreed.

Alan Stern

2010-06-07 15:51:34

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integrationy

Alan,

On Mon, 7 Jun 2010, Alan Stern wrote:
> On Mon, 7 Jun 2010, Thomas Gleixner wrote:
> > #2 is a tad harder, as it requires to fix the trusted apps not to fire
> > timers when there is nothing to do.
>
> No; all you have to do is handle the trusted apps as though they were
> untrusted -- just as in the original wakelock approach.
>
> > Though you can solve it with cgroups as well. The unfreeze problem
> > for real wakeups can be solved as mhelsley pointed out somewhere
> > else in this thread.
> >
> > But that depends on user space changes ....
>
> If you handle all the apps uniformly, very few userspace changes are
> needed.

Oh, I see. I misunderstood you. -ENOTENOUGHSLEEP

Thanks,

Thomas

2010-06-07 15:56:26

by Florian Mickler

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, 6 Jun 2010 20:01:56 -0400 (EDT)
Alan Stern <[email protected]> wrote:

> And since Android can reach essentially the same low-power
> state from idle as from suspend, it appears that they really don't need
> any kernel changes at all.

Well, perhaps a hint to the scheduler to fall through as fast as
possible into deepest idle?

>
> Alan Stern
>

Cheers,
Flo

2010-06-07 18:40:37

by David Brownell

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration



--- On Mon, 6/7/10, Peter Zijlstra <[email protected]> wrote:
> So what's up with this Binder stuff, from what I can see
> its just
> yet-another-CORBA. Why does it need a kernel part at all,
> can't you
> simply run with a user-space ORB instead?
>
> I really don't get why people keep re-inventing CORBA,


That made me laugh. Do you realize that one
of the earliest objections to CORBA was "why do
people keep re-inventing RPC" ... :)

(Simple answer: the existing stuff didn't solve
enough of the right problems ... and it was easier
(in a political sense) to come up with something
new than to try fixing DCE or ONC (or whatever).
Similar answers may still apply ... last I looked
at CORBA, it didn't standardize desktop integration
(or cell-phone equivalents), and the pure user-space
versions suffered slowdowns when looking up object
bindings.


> there's some really nice (free) ORBs out there,

2010-06-07 23:16:07

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/5 Alan Stern <[email protected]>:
> On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>
>> Yes, we can keep all our user space suspend blockers and thaw the
>> frozen cgroup when any suspend blocker is held, but this would
>> eliminate any power advantage that freezing a cgroup has over using
>> suspend to freeze all processes. Without annotating the drivers to
>> block the cgroup freezing in the same places as we now block suspend,
>> it also prevents processes in the cgroup that we freeze from directly
>> consuming wakup events.
>
> The driver annotations don't need to block the cgroup freezing. ?They
> just need to keep the system running long enough to awaken a thread
> that will handle the wakeup event. ?(See below.) ?A pm-qos constraint
> is good enough for this.
>

I'm not sure what you mean by this, either you need to annotate the
drivers or you don't.

>> If you are referring to the approach that we don't use suspend but
>> freeze a cgroup instead, this only solves the problem of bad apps. It
>> does not help pause timers in trusted user space code and in the
>> kernel, so it does not lower our average power consumption.
>
> You can solve this problem if you restructure your "trusted" apps in
> the right way. ?Require a trusted app to guarantee that whenever it
> doesn't hold any suspend blockers, it will do nothing but wait (in a
> poll() system call for example) for a wakeup event. ?When the event
> occurs, it must then activate a suspend blocker.
>

This breaks existing apps. It effectively requires that a process that
use suspend blocker do no work that does not block suspend.

> Better yet, make it more fine-grained. ?Instead of trusted apps, have
> trusted threads. ?Freeze the untrusted threads along with everything
> else, and require the trusted threads to satisfy this guarantee.
>

This would create a minefield of possible deadlocks. You now have to
make sure that your trusted threads do not share any locks with your
untrusted threads. For instance you cannot safely call into the heap
while any threads in your process are frozen.

> In this way, while the system is idle no user timers will get renewed.
> Kernel timers are another matter, but we should be able to handle them.
> There's nothing Android-specific about wanting to reduce kernel timer
> wakeups while in a low-power mode.
>
>> And, it
>> does not solve the problem for systems that enters lower power states
>> from suspend than it can from idle. The last point my not be relevant
>> to android anymore, but desktop systems already have auto suspend and
>> it would be preferable to have a race free kernel api for this.
>
> This is an entirely different matter from the rest of the discussion.
> It would be better to consider this separately after Android's current
> problems have been addressed.
>

Yes there has not been much discussion about this, but I don't
understand why not. Automatic suspend is used outside Android, and it
has the same race conditions that suspend blockers fix.

--
Arve Hj?nnev?g

2010-06-07 23:17:18

by Linus Walleij

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/7 Peter Zijlstra <[email protected]>:
> On Sun, 2010-06-06 at 12:58 -0700, Brian Swetland wrote:
>> Somebody will have to broker a deal with the frameworks/apps folks to
>> get rid of the binder. ?They like it a lot. ?Of course if somebody
>> built a drop-in replacement for the userspace side that didn't require
>> a kernel driver, had the same performance characteristics, solved the
>> same problems, etc, they could probably make an argument for it (or
>> just provide it as a drop-in replacement for people who want a more
>> "pure" linux underneath Android, even if we didn't pick it up).
>
> So what's up with this Binder stuff, from what I can see its just
> yet-another-CORBA. Why does it need a kernel part at all, can't you
> simply run with a user-space ORB instead?
>
> I really don't get why people keep re-inventing CORBA, there's some
> really nice (free) ORBs out there, like:
>
> ?http://www.cs.wustl.edu/~schmidt/TAO.html

There was a mailthread on LKML a while back where binder was
discussed, where Dianne Hackborn explained in detail how Android
uses binder. At the time it was contrasted with D-Bus (the IPC
mechanism that has largely replaced DCOP (KDE) and Bonobo
(GNOME), the latter was actually CORBA-based).

I don't think there was any conclusion, but it was pretty clear that
binder is an Android key asset, actually the key component that
the Android people have brought with them from BeOS to
Palmsource to Android to Google, and they really really like to use
that thing.

It's built into the entire Android userspace for all IPC, except the
stuff that's handled by D-Bus instead (yes they have
both for some cases).

What sets binder aside from the others is that it's kernel-based;
things like low-latency and large buffer-passing have been mentioned
as key features of the kernel driver.

Solving binder one way is to just include it and say it's needed
to run Android, the other is to define the technical issue at hand,
which is: "can the kernel support high-speed, low-latency,
partly marshalled, large-buffer IPC?"

D-Bus (on a local machine, mind you, it can use TCP also) will use
a simple unix domain socket by:

socket(PF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0)
as can be seen here:
http://cgit.freedesktop.org/dbus/dbus/tree/dbus/dbus-sysdeps-unix.c

ACE/TAO as referenced seems to use only TCP sockets actually:
https://svn.dre.vanderbilt.edu/viewvc/Middleware/trunk/ACE/ace/Sock_Connect.cpp?view=co
Perhaps it simply uses 127.0.0.1 for local IPC. (The source is
voluminous and hard for me to navigate, perhaps someone
familiar with it can add something here.)

Then either D-Bus or TAO builds a complete marshalling stack on
top of these sockets, it's all fully abstract, fully userspace. Several
processes and dbus daemons push/pull bytes into these sockets.
I think DCOP and Bonobo basically do the same thing, by
the way.

Binder on the other hand is a large kernel module:
http://android.git.kernel.org/?p=kernel/experimental.git;a=blob;f=drivers/staging/android/binder.c;h=e13b4c4834076eb64680457049832af0b92d88b9;hb=android-2.6.34-test2

It will do some serious reference counting, handshaking back-and-forth
and so on. Basically a lot of the stuff that other IPC mechanisms
also does, but in kernelspace. (OK I'm oversimplifying, binder
is far more lightweight for one.)

The bigger question behind it all is this:

Does the kernel provide the proper support for local IPC
transport, or is there more it could do in terms of interface, latency,
throughput?

A domain socket bitsink should be enough for everybody?

So I would really like to know from the Android people why the
binder is in the kernel, after all. Could it *theoretically* be in
userspace, on top of some unix domain sockets, running as a
real-time scheduled daemon or whatever, still yielding the same
performance? Or is there some discovered limitation with current
interfaces, that everybody ought to know? Especially authors of
D-Bus and TAO etc would be very interested in this I believe.

It's not like I don't understand that it would be hard to move this
thing to userspace, it's more that I'd like to know how you think it
would be impacted by that.

Yours,
Linus Walleij

2010-06-07 23:34:33

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/6 Thomas Gleixner <[email protected]>:
> On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>> 2010/6/5 Thomas Gleixner <[email protected]>:
>> > On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>> >> 2010/6/5 Thomas Gleixner <[email protected]>:
>> >> > On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>> >> >> >> > That download might take a minute or two, but that's not an
>> >> >> >> > justification for the crapplication to run unconfined and prevent
>> >> >> >> > lower power states.
>> >> >> >> >
>> >> >> >>
>> >> >> >> I agree, but this is not a simple problem to solve.
>> >> >> >
>> >> >> > Not with suspend blockers, but with cgroup confinement of crap, it's
>> >> >> > straight forward.
>> >> >> >
>> >> >>
>> >> >> I don't think is is straight forward. If the a process in the frozen
>> >> >> group holds a resource that a process in the unfrozen group needs, how
>> >> >> do deal with that?
>> >> >
>> >> > I'm going to fix the framework which puts the group into freeze state
>> >> > w/o making sure that there is no held shared resource. Come on it's
>> >> > not rocket science.
>> >> >
>> >>
>> >> I'm not sure which framework you are talking about here, but I don't
>> >> think there is a single framework that knows about all shared
>> >> resources.
>> >
>> > Damn, it's not me talking about "our framework", you are mentioning
>> > when it fits your needs.
>>
>> You said you were going to fix the framework. I did know if you were
>> talking about the cgroup framework, or the android user-space
>> frameworks. I don't think either has knowledge about all shared
>> resources.
>
> The cgroup freezer makes sure that there are no in kernel resources
> blocked. Of course the user space side has to do the same and it's not
> rocket science.
>

It is much easier to make sure there are no kernel resources held than
that there are no resources held in user-space. The kernel threads
have to return to a safe location before they get frozen, where user
threads are just frozen wherever they are.

--
Arve Hj?nnev?g

2010-06-07 23:37:15

by Brian Swetland

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Mon, Jun 7, 2010 at 4:17 PM, Linus Walleij
<[email protected]> wrote:
>
> So I would really like to know from the Android people why the
> binder is in the kernel, after all. Could it *theoretically* be in
> userspace, on top of some unix domain sockets, running as a
> real-time scheduled daemon or whatever, still yielding the same
> performance? Or is there some discovered limitation with current
> interfaces, that everybody ought to know? Especially authors of
> D-Bus and TAO etc would be very interested in this I believe.
>
> It's not like I don't understand that it would be hard to move this
> thing to userspace, it's more that I'd like to know how you think it
> would be impacted by that.

Fundamentally, yes, you should be able to replicate the functionality
in userspace. We considered this during 1.0 development, but it ended
up being a lot of risk (at the point when it was discussed) compared
to using the existing driver that we had. You almost certainly would
need a central daemon to do some state and permission management as
well as track some of the refcounting, you could use EPIPE on local
sockets to detect remote process termination. You could even just use
local sockets for high level control and use shared memory for actual
message transport to avoid copy-in-copy-out overhead (another binder
driver feature).

That said, the userspace environment was built up around the binder,
relies on it heavily for all ipc (except for dbus which we use for
bluez because it just hasn't been worth the headache to maintain
alternate ipc patches for bluez), and is performance sensitive (it's
possible that you could achieve similar performance with a suitably
clever userspace implementation making use of shared memory, of
course), and the frameworks/apps folks are happy with it as is (so
talking them into replacing it may be a nontrivial exercise).

I wouldn't mind not having to maintain the kernel driver (well, not
having Arve have to maintain the kernel driver...) but building a
pure-userspace replacement would be a pretty huge undertaking,
especially given all the other work we have just with general kernel
development, bringup, etc.

Since all binder comms in userspace bottlenecks through two small
libraries (one C++, one lighter weight C), in theory you could build a
drop-in replacement and then prove it out, verify correctness and
performance, and make the argument for replacing the existing
implementation.

Debugging binder implementation issues under a full system using many
binder services and patterns like "client A calls service B which
returns and object in service C", is a bit of a nightmare. I try to
stay far away from it, myself.

Brian

2010-06-08 00:05:34

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/6 Thomas Gleixner <[email protected]>:
> On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>> 2010/6/5 Thomas Gleixner <[email protected]>:
>> >
>> > Can you please explain in a consistent way how the application stack
>> > and the underlying framework (which exists according to android docs)
>> > is handling events and how the separation of trust level works ?
>> >
>>
>> I don't think I can, since I only know small parts of it. I know some
>
> Sigh, thats the whole reason why this discussion goes nowhere.
>

Please keep in mind that we also have third party applications and
that it is not acceptable to break them. So even if I was able to tell
you everything our framework does, you still need to make sure your
solution does not break existing apps.

> How in heavens sake should we be able to decide whether suspend
> blockers are the right and only thing which solves a problem, when the
> folks advocating suspend blockers are not able to explain the problem
> in the first place ?
>
>> events like input event go though a single thread in our system
>> process, while other events like network packets (which are also
>> wakeup events) goes directly to the app.
>
> Yes, we know that already, but that's a completely useless information
> as it does not describe the full constraints and dependencies.
>
> Lemme summarize:
>
> ?Android needs suspend blockers, because it works, but cannot explain
> ?why it works and why it only works that way.
>
> A brilliant argument to merge them - NOT.
>

Your solution changes the programming model in a way that suspend does
not. Linux allow processes to communicate with each other, and if you
freeze individual processes this breaks. For the android framework
code a lack of a timely response from an application is treated as an
error, and the user is notified that the application is misbehaving.
It may be possible to change the framework to make sure that no
processes are frozen while it is waiting for a response, but this is a
major change and applications that receive wakeup events directly from
the kernel will still be broken.

--
Arve Hj?nnev?g

2010-06-08 00:23:28

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/6 Alan Stern <[email protected]>:
> On Sat, 5 Jun 2010, Alan Stern wrote:
>
>> > If you are referring to the approach that we don't use suspend but
>> > freeze a cgroup instead, this only solves the problem of bad apps. It
>> > does not help pause timers in trusted user space code and in the
>> > kernel, so it does not lower our average power consumption.
>>
>> You can solve this problem if you restructure your "trusted" apps in
>> the right way. ?Require a trusted app to guarantee that whenever it
>> doesn't hold any suspend blockers, it will do nothing but wait (in a
>> poll() system call for example) for a wakeup event. ?When the event
>> occurs, it must then activate a suspend blocker.
>>
>> Better yet, make it more fine-grained. ?Instead of trusted apps, have
>> trusted threads. ?Freeze the untrusted threads along with everything
>> else, and require the trusted threads to satisfy this guarantee.
>>
>> In this way, while the system is idle no user timers will get renewed.
>> Kernel timers are another matter, but we should be able to handle them.
>> There's nothing Android-specific about wanting to reduce kernel timer
>> wakeups while in a low-power mode.
>
> In fact it's possible to do this with only minimal changes to the
> userspace, providing you can specify all your possible hardware wakeup
> sources. ?(On the Android this list probably isn't very large -- I
> imagine it includes the keypad, the radio link(s), the RTC, and maybe
> a few switches, buttons, or other things.)
>
> Here's how you can do it. ?Extend the userspace suspend-blocker API, so
> that each suspend blocker can optionally have an associated wakeup
> source.
>
> The power-manager process should keep a list of "active" wakeup
> sources. ?A source gets removed from the list when an associated
> suspend blocker is activated.
>

How do you do this safely? If you remove the active wakeup only when
activating the suspend blocker, you will never unblock suspend if
another wakeup event happens after user-space blocked suspend but
before user-space read the events.

Also, I'm not sure we can easily associate a wakeup event with a user
space suspend blocker. For instance when an alarm triggers it is
sometimes because of a user-space alarm and sometimes because an
in-kernel alarm.

> When the "active" list is empty and no suspend blockers are activated,
> the power manager freezes ALL other processes, trusted and untrusted
> alike. ?It then does a big poll() on all the wakeup sources. ?When the
> poll() returns, its output is used to repopulate the "active" list and
> processes are unfrozen.
>
> (You can also include some error detection: If a source remains on the
> "active" list for too long then something has gone wrong.)
>
> To do all this you don't even need to use cgroups. ?The existing PM
> implementation allows a user process to freeze everything but itself;
> that's how swsusp and related programs work.
>
> This is still a big-hammer sort of approach, but it doesn't require any
> kernel changes.
>
> Alan Stern
>
>



--
Arve Hj?nnev?g

2010-06-08 00:40:23

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

2010/6/6 Rafael J. Wysocki <[email protected]>:
> On Sunday 06 June 2010, Arve Hj?nnev?g wrote:
>> 2010/6/5 Rafael J. Wysocki <[email protected]>:
>> > On Saturday 05 June 2010, Arve Hj?nnev?g wrote:
>> >> 2010/6/5 Thomas Gleixner <[email protected]>:
>> >> > B1;2005;0cOn Fri, 4 Jun 2010, Arve Hj?nnev?g wrote:
> ...
>> >
>> > Arve, we're still learning you have some more requirements we had no idea
>>
>> What new requirement are you talking about. Did you assume all our
>> user-space ipc calls went though a single process?
>
> No, but I didn't assume that your wakelock-holding processes depend on the
> other processes in a way that might prevent them from acquiring or dropping
> a wakelock.
>

It does not prevent it from acquiring a wakelock (assuming the already
held wakelock does not have a timeout), but it could delay it and
cause an error dialog to pop up stating that the fozen app is
misbehaving.

> ...
>> >> >> Trusted code that calls into untrusted code has to deal with the
>> >> >> untrusted code not responding, but we only want to pop up a message
>> >> >> that the application is not responding if it is misbehaving, not just
>> >> >> because it was frozen though no fault of its own.
>> >
>> > When Android starts opportunistic suspend, all applications are frozen,
>> > "trusted" as well as "untrusted", right? ?So, after they are all frozen, none
>> > of them can do anything to prevent suspend from happening, right?
>>
>> Not if you mean when we write to /sys/power/state. Processes are not
>> frozen until the last suspend blocker is released.
>
> That doesn't matter. ?In the opportunistic mode you don't need to write into
> /sys/power/state to start suspend, this is done by the kernel automatically as
> soon as the last wakelock has been released (at least this is my assumption
> about how this works). ?Now, at this point the processes that don't use
> wakelocks can't really prevent themselves from being frozen and only the
> wakelocks users can do that. ?So, if a wakelock user depends on a process
> that doesn't use wakelocks in such a way that (because of that dependence) it
> can't acquire its wakelock while processes are being frozen, things don't work
> as they are supposed to.
>
You seem to forget that we use overlapping wakelocks. A process that
need to acquire a wakelock does so before the driver it talks to
releases its wakelock. At this point no processes are frozen.

>> > Now, in my proposed approach the "untrusted" apps are frozen exactly at the
>> > point Android would start opportunistic suspend and they wouldn't be able
>> > to do anything about that anyway. ?So if one of your "trusted" apps depends
>> > on the "untrusted" ones in a way that you describe, you alread have a bug
>> > (the "trusted" app cannot prevent automatic suspend from happening even if it
>> > wants, because it depends on an "untrusted" app that has just been frozen).
>> >
>>
>> I don't think what you said here is correct. If a wakeup event happens
>> all processed are unfrozen since the driver blocks suspend.
>
> This only means that the theoretical failure you gave as an example doesn't
> happen in practice. ?No problem, then. :-)
>

If individual processes are frozen, we run into problems that we
cannot run into if we freeze and thaw all processes.

>> The app that reads this event blocks suspend before reading it. If it was
>> busy talking to a less trusted app when the event happened it still works
>> since all apps are running at this point.
>
> And how is this different from an approach with cgroup freezing? ?Apps that
> use wakelock within the current framework would use "freeze locks" to prevent
> the "untrusted" part of user space from being frozen or to thaw it. ?Where's
> the problem, then?
>

They will not be able to get wakeup events directly from the kernel.
If the kernel does not thaw processes when a wakeup event happens, the
app may never get to the point where it grabs its wakelock.

--
Arve Hj?nnev?g

2010-06-08 00:45:05

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Sun, Jun 6, 2010 at 7:43 AM, Matt Helsley <[email protected]> wrote:
> On Sun, Jun 06, 2010 at 12:36:21PM +0200, Thomas Gleixner wrote:
>> On Sat, 5 Jun 2010, Arve Hj?nnev?g wrote:
>
> <snip>
>
>> > events like input event go though a single thread in our system
>> > process, while other events like network packets (which are also
>> > wakeup events) goes directly to the app.
>
> If you want to wake up cgroup-frozen tasks for these fds perhaps your
> framework can fcnt(fd, F_SETOWN, <p[g]id>) to send SIGIO to a

How does the framework get all the fds that are used by the apps for
wakeup events?

> userspace-suspend-blocker thread/process/process group. When IO comes in, the
> suspend blocker is signalled which then unfreezes the cgroup of the frozen
> untrusted task. SIGIO works on pipes, fifos, sockets, ttys, and ptys --
> many of which are precisely the kinds of things that would connect [trusted
> and untrusted] apps. Notably absent (last I checked): inotify fds, signalfd,
> timerfd, eventfd, filesystem fds and likely more.
>
> Incidentally, this is just to show that it's not impossible to implement
> "wakeups" for cgroup-frozen tasks in userspace.
>
> Cheers,
> ? ? ? ?-Matt Helsley
>



--
Arve Hj?nnev?g

2010-06-08 00:57:22

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 6, 2010 at 5:01 PM, Alan Stern <[email protected]> wrote:
> On Sun, 6 Jun 2010, Matthew Garrett wrote:
>
>> The difference between idle-based suspend and opportunistic suspend is
>> that the former will continue to wake up for timers and will never be
>> entered if something is using CPU, whereas the latter will be entered
>> whenever no suspend blocks are held. The problem with opportunistic
>> suspend is that you might make the decision to suspend simultaneusly
>> with a wakeup event being received. Suspend blocks facilitate
>> synchronisation between the kernel and userspace to ensure that all such
>> events have been consumed and handld appropriately.
>
> Remember that suspend takes place in several phases, the first of which
> is to freeze tasks. ?The phases can be controlled individually by the
> process carrying out a suspend, and there's nothing to prevent you from
> stopping after the freezer phase. ?Devices won't get powered down, but
> Android uses aggressive runtime power management for its devices
> anyway.
>
> If you do this then the synchronization can be carried out entirely
> from userspace, with no need for kernel modifications such as suspend
> blockers. And since Android can reach essentially the same low-power
> state from idle as from suspend, it appears that they really don't need
> any kernel changes at all.
>

I don't think this is true. If you stop after the freezer phase you
still need all the suspend blockers that are held until user-space
consumes an event, otherwise it never gets consumed since user-space
is frozen.

--
Arve Hj?nnev?g

2010-06-08 01:09:51

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Mon, 7 Jun 2010, Arve Hj?nnev?g wrote:

> > In fact it's possible to do this with only minimal changes to the
> > userspace, providing you can specify all your possible hardware wakeup
> > sources. ?(On the Android this list probably isn't very large -- I
> > imagine it includes the keypad, the radio link(s), the RTC, and maybe
> > a few switches, buttons, or other things.)
> >
> > Here's how you can do it. ?Extend the userspace suspend-blocker API, so
> > that each suspend blocker can optionally have an associated wakeup
> > source.
> >
> > The power-manager process should keep a list of "active" wakeup
> > sources. ?A source gets removed from the list when an associated
> > suspend blocker is activated.
> >
>
> How do you do this safely? If you remove the active wakeup only when

"remove the active wakeup" isn't a good way of expressing this. You
remove the wakeup source from the power manager's list of active
sources. This is just manipulation of a data structure internal to the
power manager; it doesn't affect the actual source.

> activating the suspend blocker, you will never unblock suspend if
> another wakeup event happens after user-space blocked suspend but
> before user-space read the events.

I'm not sure what you mean. In this scheme userspace doesn't ever
block suspends. Instead the power manager freezes and unfreezes all
the other processes. And the system never suspends, it simply goes
idle for prolonged periods of time... with all processes frozen except
the power manager, and it sitting inside a poll() system call.

It's true that under some exceptional circumstances the system would
never remove a wakeup source from the "active" list and then would
never go idle. But exactly the same problem exists with wakelocks, if
the kernel activates a wakelock and there's no user process reading the
corresponding event queue.

> Also, I'm not sure we can easily associate a wakeup event with a user
> space suspend blocker. For instance when an alarm triggers it is
> sometimes because of a user-space alarm and sometimes because an
> in-kernel alarm.

That's okay. The association is optional, and not all suspend blockers
will have an associated wakeup source. (However, each wakeup source
that needs to percolate up into userspace -- i.e., that isn't handled
internally by the kernel -- should have at least one associated suspend
blocker.) The purpose of these associations is to make explicit the
"handoff" in your original scheme, whereby a source would cause the
kernel to activate a wakelock until some user process activated its own
and then cleared the kernel's wakelock. In your scheme, the connection
between the userspace wakelock and the wakeup source is implicit; in
my scheme it is explicit.

For example, a process that uses a suspend blocker in order to read
keystrokes would obviously associate its suspend blocker with the
keypad-matrix wakeup source.

Or take your example of an alarm. To make it work in my scheme, each
user alarm would have to be implemented as a poll-able file descriptor.
Processes reading the descriptor would block until the alarm expires.
(I don't know of any driver that provides this sort of timer interface,
but it would be easy to write one. You could think of it as applying
Unix's "Everything is a file" philosophy to alarms.) Each of these
descriptors would then be a wakeup source, included among the sources
that the power manager polls for, and a suspend blocker could be
associated with it.

Alan Stern

2010-06-08 01:13:41

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Mon, 7 Jun 2010, Arve Hj?nnev?g wrote:

> > Remember that suspend takes place in several phases, the first of which
> > is to freeze tasks. ?The phases can be controlled individually by the
> > process carrying out a suspend, and there's nothing to prevent you from
> > stopping after the freezer phase. ?Devices won't get powered down, but
> > Android uses aggressive runtime power management for its devices
> > anyway.
> >
> > If you do this then the synchronization can be carried out entirely
> > from userspace, with no need for kernel modifications such as suspend
> > blockers. And since Android can reach essentially the same low-power
> > state from idle as from suspend, it appears that they really don't need
> > any kernel changes at all.
> >
>
> I don't think this is true. If you stop after the freezer phase you
> still need all the suspend blockers that are held until user-space
> consumes an event, otherwise it never gets consumed since user-space
> is frozen.

No. You don't need to hold any suspend blockers; you merely need to
unfreeze userspace. Once it is unfrozen, it will eventually consume
the wakeup event.

What you _do_ need to do is to prevent userspace from getting frozen
again too soon. That's the purpose of the list of "active" wakeup
sources. So long as the power manager believes a source is still
active, it won't freeze anything.

Alan Stern

2010-06-08 01:31:33

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/7 Alan Stern <[email protected]>:
> On Mon, 7 Jun 2010, Arve Hj?nnev?g wrote:
>
>> > In fact it's possible to do this with only minimal changes to the
>> > userspace, providing you can specify all your possible hardware wakeup
>> > sources. ?(On the Android this list probably isn't very large -- I
>> > imagine it includes the keypad, the radio link(s), the RTC, and maybe
>> > a few switches, buttons, or other things.)
>> >
>> > Here's how you can do it. ?Extend the userspace suspend-blocker API, so
>> > that each suspend blocker can optionally have an associated wakeup
>> > source.
>> >
>> > The power-manager process should keep a list of "active" wakeup
>> > sources. ?A source gets removed from the list when an associated
>> > suspend blocker is activated.
>> >
>>
>> How do you do this safely? If you remove the active wakeup only when
>
> "remove the active wakeup" isn't a good way of expressing this. ?You
> remove the wakeup source from the power manager's list of active
> sources. ?This is just manipulation of a data structure internal to the
> power manager; it doesn't affect the actual source.
>
>> activating the suspend blocker, you will never unblock suspend if
>> another wakeup event happens after user-space blocked suspend but
>> before user-space read the events.
>
> I'm not sure what you mean. ?In this scheme userspace doesn't ever
> block suspends. ?Instead the power manager freezes and unfreezes all
> the other processes. ?And the system never suspends, it simply goes
> idle for prolonged periods of time... with all processes frozen except
> the power manager, and it sitting inside a poll() system call.
>
> It's true that under some exceptional circumstances the system would
> never remove a wakeup source from the "active" list and then would
> never go idle. ?But exactly the same problem exists with wakelocks, if
> the kernel activates a wakelock and there's no user process reading the
> corresponding event queue.
>

No, you have a different problem. If you open an input device and
issue the ioctl to enable the suspend blocker that blocks while the
queue is not empty then don't read the event, that is a bug that is
easy to fix. What you have is a race condition. If you read an event
that occurred after you blocked the task freezing tasks will never get
frozen again (until more events occur).

>> Also, I'm not sure we can easily associate a wakeup event with a user
>> space suspend blocker. For instance when an alarm triggers it is
>> sometimes because of a user-space alarm and sometimes because an
>> in-kernel alarm.
>
> That's okay. ?The association is optional, and not all suspend blockers
> will have an associated wakeup source. ?(However, each wakeup source
> that needs to percolate up into userspace -- i.e., that isn't handled
> internally by the kernel -- should have at least one associated suspend
> blocker.) ?The purpose of these associations is to make explicit the
> "handoff" in your original scheme, whereby a source would cause the
> kernel to activate a wakelock until some user process activated its own
> and then cleared the kernel's wakelock. ?In your scheme, the connection
> between the userspace wakelock and the wakeup source is implicit; in
> my scheme it is explicit.
>
> For example, a process that uses a suspend blocker in order to read
> keystrokes would obviously associate its suspend blocker with the
> keypad-matrix wakeup source.
>

We have multiple input devices and one thread reading from them. Do
all input devices that can generate wakeup events share a wakeup
source?

> Or take your example of an alarm. ?To make it work in my scheme, each
> user alarm would have to be implemented as a poll-able file descriptor.
> Processes reading the descriptor would block until the alarm expires.
> (I don't know of any driver that provides this sort of timer interface,
> but it would be easy to write one. ?You could think of it as applying
> Unix's "Everything is a file" philosophy to alarms.) ?Each of these
> descriptors would then be a wakeup source, included among the sources
> that the power manager polls for, and a suspend blocker could be
> associated with it.
>

It seems you would need a way to pass the wakeup source id to use from
user space to the driver and for this to work (ignoring the race if
you allow multiple alarms per file) which seems like more work than
using a suspend blocker.

--
Arve Hj?nnev?g

2010-06-08 02:17:06

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Tue, 08 Jun 2010 01:17:13 +0200, Linus Walleij said:

> So I would really like to know from the Android people why the
> binder is in the kernel, after all. Could it *theoretically* be in
> userspace, on top of some unix domain sockets, running as a
> real-time scheduled daemon or whatever, still yielding the same
> performance? Or is there some discovered limitation with current
> interfaces, that everybody ought to know?

Not an Android person, but...

How expensive is a userspace->kernel transition on Android-class hardware?
There's certainly something to be said for short-circuiting the path
source_process -> kernel -> broker_process -> kernel -> dest_process
down to the shorter source->kernel->dest


Attachments:
(No filename) (227.00 B)

2010-06-08 02:32:56

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Mon, 7 Jun 2010, Arve Hj?nnev?g wrote:

> > It's true that under some exceptional circumstances the system would
> > never remove a wakeup source from the "active" list and then would
> > never go idle. ?But exactly the same problem exists with wakelocks, if
> > the kernel activates a wakelock and there's no user process reading the
> > corresponding event queue.
> >
>
> No, you have a different problem. If you open an input device and
> issue the ioctl to enable the suspend blocker that blocks while the

Um, the suspend blocker that is active while the queue is nonempty is
an in-kernel suspend blocker, right? Not a userspace suspend blocker.
Hence it doesn't have to be enabled by an ioctl. Or is this some part
of the whole wakelock design that hasn't yet been posted? As far as I
know, you intended the in-kernel suspend blocker to be enabled whenever
the input device file is open.

> queue is not empty then don't read the event, that is a bug that is
> easy to fix.

I assume you mean "you open an input device but then fail to read from
it". When that happens the device's driver will activate its in-kernel
suspend blocker, and since the input queue will never become empty, the
suspend blocker will never be deactivated. Yes, that's a bug.

> What you have is a race condition. If you read an event
> that occurred after you blocked the task freezing tasks will never get
> frozen again (until more events occur).

Sorry, I can't parse that sentence. Could you rephrase it more
grammatically?

It seems to say: "If you read an event that occurred after [something],
then tasks won't get frozen again until more events occur". Which
doesn't make sense, firstly because in my scheme reading events has no
direct connection with freezing or unfreezing tasks, and secondly
because the occurrence of events doesn't cause tasks to be frozen --
just the opposite: occurrence of events _prevents_ tasks from being
frozen.

> We have multiple input devices and one thread reading from them. Do
> all input devices that can generate wakeup events share a wakeup
> source?

Basically, a wakeup source is a file descriptor that in your scheme,
some user process would read from in order to clear an in-kernel
wakelock. Thus, if each of your input devices activates an in-kernel
wakelock that is cleared when a user process reads the device, then the
file descriptors for these input devices would each be considered a
wakeup source.

> It seems you would need a way to pass the wakeup source id to use from
> user space to the driver and for this to work

No, nothing needs to be passed from userspace to the kernel. However
the source ID (or a set of source IDs) does need to be passed to the
power manager process, probably when the suspend blocker is created.

[On rereading this, I realized it might not have been clear that in my
scheme, suspend blockers have no in-kernel component. They are
implemented entirely by IPC between the process owning the suspend
blocker and the power manager process. Would it be less confusing if I
called them something else?]

> (ignoring the race if
> you allow multiple alarms per file) which seems like more work than
> using a suspend blocker.

It's not very much more: just one additional argument to the routine
that creates a suspend blocker.

I get the impression that you don't fully understand how my scheme is
meant to work. Would some additional explanation or examples help?

Alan Stern

2010-06-08 03:06:00

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/7 Alan Stern <[email protected]>:
> On Mon, 7 Jun 2010, Arve Hj?nnev?g wrote:
>
>> > It's true that under some exceptional circumstances the system would
>> > never remove a wakeup source from the "active" list and then would
>> > never go idle. ?But exactly the same problem exists with wakelocks, if
>> > the kernel activates a wakelock and there's no user process reading the
>> > corresponding event queue.
>> >
>>
>> No, you have a different problem. If you open an input device and
>> issue the ioctl to enable the suspend blocker that blocks while the
>
> Um, the suspend blocker that is active while the queue is nonempty is
> an in-kernel suspend blocker, right? ?Not a userspace suspend blocker.
> Hence it doesn't have to be enabled by an ioctl. ?Or is this some part
> of the whole wakelock design that hasn't yet been posted? ?As far as I
> know, you intended the in-kernel suspend blocker to be enabled whenever
> the input device file is open.
>

The patch that modifies evdev (posted in this patchset) uses an ioctl
to enable the suspend blocker. Not all input devices are used for
wakeup events and those don't need to block suspend.

>> queue is not empty then don't read the event, that is a bug that is
>> easy to fix.
>
> I assume you mean "you open an input device but then fail to read from
> it". ?When that happens the device's driver will activate its in-kernel
> suspend blocker, and since the input queue will never become empty, the
> suspend blocker will never be deactivated. ?Yes, that's a bug.
>
>> What you have is a race condition. If you read an event
>> that occurred after you blocked the task freezing tasks will never get
>> frozen again (until more events occur).
>
> Sorry, I can't parse that sentence. ?Could you rephrase it more
> grammatically?
>

If you read an event that occurred after you blocked the task
freezing, then tasks will never get frozen again (until more events
occur). I think my original description was less confusing, but it
seems you got completely distracted by my use of block and unblock
suspend when referring to the user space api.


> It seems to say: "If you read an event that occurred after [something],

Block suspend, block task freezing or whatever you want to call it.

> then tasks won't get frozen again until more events occur". ?Which
> doesn't make sense, firstly because in my scheme reading events has no
> direct connection with freezing or unfreezing tasks, and secondly

It has an indirect connection. You report a wakeup event when it
occurs, but clear it when user space calls an api before reading the
event. So:
Wakeup event occurs, and the driver:
- report wakeup event type A
- queue event for delivery to user-space
User space wakes up:
- Calls api to block task freezing for event type A
Another wakeup event occurs, and the driver:
- report wakeup event type A
- queue event for delivery to user-space
User space continues:
- Read events
- Wait for more events

Result: Task are not frozen again.

> because the occurrence of events doesn't cause tasks to be frozen --
> just the opposite: occurrence of events _prevents_ tasks from being
> frozen.
>
>> We have multiple input devices and one thread reading from them. Do
>> all input devices that can generate wakeup events share a wakeup
>> source?
>
> Basically, a wakeup source is a file descriptor that in your scheme,
> some user process would read from in order to clear an in-kernel
> wakelock. ?Thus, if each of your input devices activates an in-kernel
> wakelock that is cleared when a user process reads the device, then the
> file descriptors for these input devices would each be considered a
> wakeup source.
>
>> It seems you would need a way to pass the wakeup source id to use from
>> user space to the driver and for this to work
>
> No, nothing needs to be passed from userspace to the kernel. ?However
> the source ID (or a set of source IDs) does need to be passed to the
> power manager process, probably when the suspend blocker is created.
>

Then the source id need to be passed from the kernel to user-space.

> [On rereading this, I realized it might not have been clear that in my
> scheme, suspend blockers have no in-kernel component. ?They are
> implemented entirely by IPC between the process owning the suspend
> blocker and the power manager process. ?Would it be less confusing if I
> called them something else?]

No, that is not the unclear part. What is unclear to me is where the
source IDs come from. Are they static and hardcoded in the driver and
user-space, or are they passed between the driver and user-space
client?

>
>> (ignoring the race if
>> you allow multiple alarms per file) which seems like more work than
>> using a suspend blocker.
>
> It's not very much more: just one additional argument to the routine
> that creates a suspend blocker.
>
> I get the impression that you don't fully understand how my scheme is
> meant to work. ?Would some additional explanation or examples help?
>

I don't understand how you are planning to ensure that the driver and
user-space code that consumes the real event use the same source id.
The biggest problem I have with it though is that you have created a
new race condition between reporting that a wakeup event has occurred
and processing of the real event.

--
Arve Hj?nnev?g

2010-06-08 09:10:10

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

On Tuesday 08 June 2010, Arve Hj?nnev?g wrote:
> 2010/6/6 Rafael J. Wysocki <[email protected]>:
> > On Sunday 06 June 2010, Arve Hj?nnev?g wrote:
...
> If individual processes are frozen, we run into problems that we
> cannot run into if we freeze and thaw all processes.

Not individual processes, but the processes that don't use wakelocks in the
Android world all together. And of course the approach has to be different,
because it's a different design, but I don't think there are any fundamental
issues you can't solve within this approach.

> >> The app that reads this event blocks suspend before reading it. If it was
> >> busy talking to a less trusted app when the event happened it still works
> >> since all apps are running at this point.
> >
> > And how is this different from an approach with cgroup freezing? Apps that
> > use wakelock within the current framework would use "freeze locks" to prevent
> > the "untrusted" part of user space from being frozen or to thaw it. Where's
> > the problem, then?
> >
>
> They will not be able to get wakeup events directly from the kernel.
> If the kernel does not thaw processes when a wakeup event happens, the
> app may never get to the point where it grabs its wakelock.

The apps that use "freeze locks" (or wakelocks) are never frozen, so I don't
think this would be a problem.

Rafael

2010-06-08 10:20:16

by Florian Mickler

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Mon, 7 Jun 2010 20:05:56 -0700
Arve Hj?nnev?g <[email protected]> wrote:
Hi,

>
> If you read an event that occurred after you blocked the task
> freezing, then tasks will never get frozen again (until more events
> occur). I think my original description was less confusing, but it
> seems you got completely distracted by my use of block and unblock
> suspend when referring to the user space api.

Here is how I understood Alan's approach:

Userspace manager (UM) does:

<...continuation of function A>
5) unblock from reading a wakeup from wakeupevents-fd
6) thaw userspace
7) return
</function A>

[userspace sees there is an event; blocks suspend at UM; processes
event; consume wakeupevent at UM; unblock suspend at UM;]

Unblocking the last suspend-blocker at the UM starts function A:

<function A>
1) non-blocking read of wakeup-events-fd (refills wakeupevents)
2) if all wakeupevents are consumed:
3a) freeze userspace
else
3b) /* wait for userspace to unblock
suspend again... this should take care of the races? */ return;
4) blocking read of wakeupevents-fd
<...for continuation see above>

You mitigate the race by freezing and unfreezing userspace. If there
occur wakeups between 3a) and 4) you will have frozen userspace in
vain.

So I think the feasibility of this solution depends on the performance
of freezing/thawing userspace. I can't judge that.

Also I _think_ this is racefree as long as you have the UM properly
serialized. Or did I overlook something?


Cheers,
Flo

2010-06-08 14:51:00

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Mon, 7 Jun 2010, Arve Hj?nnev?g wrote:

> The patch that modifies evdev (posted in this patchset) uses an ioctl
> to enable the suspend blocker. Not all input devices are used for
> wakeup events and those don't need to block suspend.

But you do have a 1-1 correspondence, right? That is, the input
devices that are used for wakeup events are exactly the ones that block
suspend?


> If you read an event that occurred after you blocked the task
> freezing, then tasks will never get frozen again (until more events
> occur). I think my original description was less confusing, but it
> seems you got completely distracted by my use of block and unblock
> suspend when referring to the user space api.

I still find your wording a little confusing. Task freezing can be
prevented (a more accurate term than "blocked") by two kinds of things:
a suspend blocker or an "active" wakeup source. I'm not sure which
kind you mean here.

> It has an indirect connection. You report a wakeup event when it
> occurs, but clear it when user space calls an api before reading the
> event. So:

Yes, that's right.

> Wakeup event occurs, and the driver:
> - report wakeup event type A
> - queue event for delivery to user-space

That's not really two distinct steps. Queuing the event for delivery
to userspace involves waking up any tasks that are waiting to read the
device file; that action (calling wake_up_all() or whatever the driver
does) is how the event gets reported.

> User space wakes up:
> - Calls api to block task freezing for event type A

Again, that's a confusing way of putting it. The API you're referring
to is simply the function that activates a suspend blocker. It does
prevent task freezing, but you shouldn't say it prevents freezing for
event type A. More like the other way around: In addition to
preventing freezing, the function tells the power manager that event
type A should no longer be considered active. Thus, in a sense it
_stops_ event type A from preventing freezing.

> Another wakeup event occurs, and the driver:
> - report wakeup event type A
> - queue event for delivery to user-space

Same as above.

> User space continues:
> - Read events
> - Wait for more events
>
> Result: Task are not frozen again.

Because the suspend blocker was never deactivated. The same thing
happens with wakelocks: If a task activates a wakelock and never
deactivates it, the system won't go into opportunistic suspend again.

Here's how my scheme is meant to work:

Wakeup event for input device A occurs.

A's driver adds an entry to the input device queue and
(if the queue was empty) does wake_up_all() on the device
file's wait_queue.

The PM process returns from poll() and sees that device
file A is now readable, so it adds A to its list of active
sources and unfreezes userspace.

Some other process sees that device file A is now readable,
so it activates a suspend blocker and reads events from A.

When the PM process receives the request to activate the
suspend blocker, it removes A from its list of active
sources. But it doesn't freeze userspace yet, because now
a suspend blocker is active.

The other process consumes events from A and does other
stuff. Maybe more input data arrives while this is happening
and the process reads it. Eventually the process decides to
deactivate the suspend blocker, perhaps when no more data
is available from the device file, perhaps not.

When the PM process receives the request to deactivate the
suspend blocker, it sees that now there are no active
sources and no active suspend blockers. Therefore it
freezes userspace and does a big poll() on all possible
sources. (If there are still events on the input device
queue, the poll() returns immediately.)

Rinse and repeat.

I don't see any dangerous races there. The scheme can be made a little
more efficient by having the PM process do another poll() (with 0
timeout) just before freezing userspace; if the result indicates that a
source is active then the freezing and unfreezing can be skipped.

The big assumption here is that a user process never consumes wakeup
events without first activating a suspend blocker. This seems like a
reasonable assumption, but we can work around it if necessary.

> >> It seems you would need a way to pass the wakeup source id to use from
> >> user space to the driver and for this to work
> >
> > No, nothing needs to be passed from userspace to the kernel. ?However
> > the source ID (or a set of source IDs) does need to be passed to the
> > power manager process, probably when the suspend blocker is created.
> >
>
> Then the source id need to be passed from the kernel to user-space.

A source ID is a file descriptor. File descriptors are passed from the
kernel to userspace whenever a file is opened; I can't deny it. And
they are passed back to the kernel as part of the read() and poll()
system calls. Is that what you mean?

> No, that is not the unclear part. What is unclear to me is where the
> source IDs come from. Are they static and hardcoded in the driver and
> user-space, or are they passed between the driver and user-space
> client?

They are not static; they are file descriptors. I guess this should
have been made more clear originally, but this is still pretty new to
me too.

> I don't understand how you are planning to ensure that the driver and
> user-space code that consumes the real event use the same source id.

How can it be otherwise? The userspace code consumes the event by
reading from the device file. In order to do so, it has to use the
same file descriptor it received when it opened the device file
originally.

> The biggest problem I have with it though is that you have created a
> new race condition between reporting that a wakeup event has occurred
> and processing of the real event.

There is no race. The driver reports an event has occurred by making
the data available to be read from the device file, and the event is
processed by reading it from the device file (or at least, that's the
first step in processing the event).


There's one other thing worth mentioning. All along I've been talking
about a power manager process that coordinates all these activities.
In theory there's no reason that process couldn't be implemented as a
kernel thread. This would improve efficiency by reducing the number of
context switches, and it would change IPC calls into plain system
calls.

If you did implement it that way, it could be done as a standalone
kernel module, totally noninvasive. It would not need to be part of
the vanilla kernel and nobody would object to it.

Alan Stern

2010-06-09 01:48:38

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/8 Alan Stern <[email protected]>:
> On Mon, 7 Jun 2010, Arve Hj?nnev?g wrote:
>
>> The patch that modifies evdev (posted in this patchset) uses an ioctl
>> to enable the suspend blocker. Not all input devices are used for
>> wakeup events and those don't need to block suspend.
>
> But you do have a 1-1 correspondence, right? ?That is, the input
> devices that are used for wakeup events are exactly the ones that block
> suspend?
>

Yes.

>
>> If you read an event that occurred after you blocked the task
>> freezing, then tasks will never get frozen again (until more events
>> occur). I think my original description was less confusing, but it
>> seems you got completely distracted by my use of block and unblock
>> suspend when referring to the user space api.
>
> I still find your wording a little confusing. ?Task freezing can be
> prevented (a more accurate term than "blocked") by two kinds of things:
> a suspend blocker or an "active" wakeup source. ?I'm not sure which
> kind you mean here.

I mean prevented by a user space suspend blocker.

>
>> It has an indirect connection. You report a wakeup event when it
>> occurs, but clear it when user space calls an api before reading the
>> event. So:
>
> Yes, that's right.
>
>> Wakeup event occurs, and the driver:
>> - report wakeup event type A
>> - queue event for delivery to user-space
>
> That's not really two distinct steps. ?Queuing the event for delivery
> to userspace involves waking up any tasks that are waiting to read the
> device file; that action (calling wake_up_all() or whatever the driver
> does) is how the event gets reported.
>

If you want to ensure that more than one process see the event it has
to be two steps, but it does not affect the race I was trying to
describe.

>> User space wakes up:
>> - Calls api to block task freezing for event type A
>
> Again, that's a confusing way of putting it. ?The API you're referring
> to is simply the function that activates a suspend blocker. ?It does
> prevent task freezing, but you shouldn't say it prevents freezing for
> event type A. ?More like the other way around: In addition to
> preventing freezing, the function tells the power manager that event
> type A should no longer be considered active. ?Thus, in a sense it
> _stops_ event type A from preventing freezing.
>
>> Another wakeup event occurs, and the driver:
>> - report wakeup event type A
>> - queue event for delivery to user-space
>
> Same as above.
>
>> User space continues:
>> - Read events

Sorry, I missed the unblock task freezing step here.

>> - Wait for more events
>>
>> Result: Task are not frozen again.
>
> Because the suspend blocker was never deactivated. ?The same thing
> happens with wakelocks: If a task activates a wakelock and never
> deactivates it, the system won't go into opportunistic suspend again.

Yes, but with the sequence of events above task will not be frozen
again even if the wake-lock/suspend-blocker/task-freezing-preventer is
released.

>
> Here's how my scheme is meant to work:
>
> ? ? ? ?Wakeup event for input device A occurs.
>
> ? ? ? ?A's driver adds an entry to the input device queue and
> ? ? ? ?(if the queue was empty) does wake_up_all() on the device
> ? ? ? ?file's wait_queue.
>
> ? ? ? ?The PM process returns from poll() and sees that device
> ? ? ? ?file A is now readable, so it adds A to its list of active
> ? ? ? ?sources and unfreezes userspace.
>
> ? ? ? ?Some other process sees that device file A is now readable,
> ? ? ? ?so it activates a suspend blocker and reads events from A.
>
> ? ? ? ?When the PM process receives the request to activate the
> ? ? ? ?suspend blocker, it removes A from its list of active
> ? ? ? ?sources. ?But it doesn't freeze userspace yet, because now
> ? ? ? ?a suspend blocker is active.

If another event happens at this point don't you put A back on the
list? If so, it never gets removed.

>
> ? ? ? ?The other process consumes events from A and does other
> ? ? ? ?stuff. ?Maybe more input data arrives while this is happening
> ? ? ? ?and the process reads it. ?Eventually the process decides to
> ? ? ? ?deactivate the suspend blocker, perhaps when no more data
> ? ? ? ?is available from the device file, perhaps not.
>
> ? ? ? ?When the PM process receives the request to deactivate the
> ? ? ? ?suspend blocker, it sees that now there are no active
> ? ? ? ?sources and no active suspend blockers. ?Therefore it
> ? ? ? ?freezes userspace and does a big poll() on all possible
> ? ? ? ?sources. ?(If there are still events on the input device
> ? ? ? ?queue, the poll() returns immediately.)
>
> ? ? ? ?Rinse and repeat.
>
> I don't see any dangerous races there. ?The scheme can be made a little
> more efficient by having the PM process do another poll() (with 0
> timeout) just before freezing userspace; if the result indicates that a
> source is active then the freezing and unfreezing can be skipped.
>
> The big assumption here is that a user process never consumes wakeup
> events without first activating a suspend blocker. ?This seems like a
> reasonable assumption, but we can work around it if necessary.
>
>> >> It seems you would need a way to pass the wakeup source id to use from
>> >> user space to the driver and for this to work
>> >
>> > No, nothing needs to be passed from userspace to the kernel. ?However
>> > the source ID (or a set of source IDs) does need to be passed to the
>> > power manager process, probably when the suspend blocker is created.
>> >
>>
>> Then the source id need to be passed from the kernel to user-space.
>
> A source ID is a file descriptor. ?File descriptors are passed from the
> kernel to userspace whenever a file is opened; I can't deny it. ?And
> they are passed back to the kernel as part of the read() and poll()
> system calls. ?Is that what you mean?
>
>> No, that is not the unclear part. What is unclear to me is where the
>> source IDs come from. Are they static and hardcoded in the driver and
>> user-space, or are they passed between the driver and user-space
>> client?
>
> They are not static; they are file descriptors. ?I guess this should
> have been made more clear originally, but this is still pretty new to
> me too.
>
>> I don't understand how you are planning to ensure that the driver and
>> user-space code that consumes the real event use the same source id.
>
> How can it be otherwise? ?The userspace code consumes the event by
> reading from the device file. ?In order to do so, it has to use the
> same file descriptor it received when it opened the device file
> originally.
>
>> The biggest problem I have with it though is that you have created a
>> new race condition between reporting that a wakeup event has occurred
>> and processing of the real event.
>
> There is no race. ?The driver reports an event has occurred by making
> the data available to be read from the device file, and the event is
> processed by reading it from the device file (or at least, that's the
> first step in processing the event).
>

If the driver making data available to be read triggers a wakeup event
in the power manager process that has to be cleared by the process
reading the events, then you have a race. Since the power manager is
selecting/polling on the same file descriptor, I don't see what you
gain from linking the wakeup events to suspend blockers. If you break
this link it think can work, but it does require us to modify all code
that reads wakeup events from the kernel to register the file
descriptors they get events from. It would also require adding
poll/select support to android alarm driver, and any driver that
currently uses a wakelock with a timeout would need to notify the user
space power manager instead.

>
> There's one other thing worth mentioning. ?All along I've been talking
> about a power manager process that coordinates all these activities.
> In theory there's no reason that process couldn't be implemented as a
> kernel thread. ?This would improve efficiency by reducing the number of
> context switches, and it would change IPC calls into plain system
> calls.
>
> If you did implement it that way, it could be done as a standalone
> kernel module, totally noninvasive. ?It would not need to be part of
> the vanilla kernel and nobody would object to it.
>
> Alan Stern
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at ?http://www.tux.org/lkml/
>



--
Arve Hj?nnev?g

2010-06-09 03:47:28

by Linus Torvalds

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration



On Tue, 8 Jun 2010, [email protected] wrote:
>
> having suspend blockers inside the kernel adds significant complexity, it's
> worth it only if the complexity buys you enough. In this case the question is
> if the suspend blockers would extend the sleep time enough more to matter. As
> per my other e-mail, this is an area with rapidly diminishing returns as the
> sleep times get longer.

Well, the counter-argument that nobody seems to have brought up is that
suspend blockers exist, are real code, and end up being shipped in a lot
of machines.

That's a _big_ argument in favour of them. Certainly much bigger than
arguing against them based on some complexity-arguments for an alternative
that hasn't seen any testing at all.

IOW, I would seriously hope that this discussion was more about real code
that _exists_ and does what people need. It seems to have degenerated into
something else.

Because in the end, "code talks, bullshit walks". People can complain and
suggest alternatives all they want, but you can't just argue. At some
point you need to show the code that actually solves the problem.

Linus

2010-06-09 01:16:10

by David Lang

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Mon, 7 Jun 2010, Florian Mickler wrote:

> On Sun, 6 Jun 2010 04:14:09 -0700 (PDT)
> [email protected] wrote:
>
>> On Sun, 6 Jun 2010, Florian Mickler wrote:
>>
>>> On Sun, 6 Jun 2010 12:19:08 +0200
>>> Vitaly Wool <[email protected]> wrote:
>>>
>>>> 2010/6/6 <[email protected]>:
>>>>
>>>>> as an example (taken from this thread).
>>>>>
>>>>> system A needs to wake up to get a battery reading, store it and go back to
>>>>> sleep, It does so every 10 seconds. But when it does so it only runs the one
>>>>> process and then goes back to sleep.
>>>>>
>>>>> system B has the same need, but wakes up every 10 minutes. but when it does
>>>>> so it fully wakes up and this allows the mail app to power up the radio,
>>>>> connect to the Internet and start checking for new mail before oppurtunistic
>>>>> sleep shuts things down (causing the mail check to fail)
>>>>>
>>>>> System A will last considerably longer on a battery than System B.
>>>>
>>>> Exactly, thanks for pointing out the specific example :)
>>>>
>>>> ~Vitaly
>>>
>>> This does not affect suspend_blockers nor does suspend_blockers
>>> interfere with that.
>>>
>>> Suspend_blockers allow the system to suspend ("mem">/sys/power/state
>>> suspend), when the userspace decides that the device is not in use.
>>>
>>> So implementing suspend_blockers support does not impact any
>>> optimizations done to either system A nor system B.
>>
>> Actually, it does.
>>
>> system A is what's being proposed by kernel developers, where the
>> untrusted stuff is in a different cgroup and what puts the system to sleep
>> is 'normal' power management. It doesn't sleep as long, but when it wakes
>> up the untrusted stuff is still frozen, so it doesn't stay awake long, or
>> do very much.
>>
>> System B is suspend blockers where you are either awake or asleep, and
>> when you wake up you wake up fully, but oppertunistic sleep can interrupt
>> untrusted processes at any time. The system sleeps longer (as fewer things
>> can wake it), but when it wakes up it's fully awake.
>>
>> David Lang
>
> You say, that coming back from suspend takes the system to full power
> (and everything runs) before it begins the descend into
> runtime-low-power?
> But are you referring to the fact that coming back
> from suspend starts in the zero-idle-state (i.e. "consumes extra
> power") or that all processes run when it is not suspended?

I am referring to the fact that with suspend blockers and opertunistic
suspend all processes start running when it's not suspended (because they
were all running when it was suspended)

If instead the system only wakes up the trusted processes to handle
whatever woke the system up and is then idle again, it spends less power
and time while awake.

> Because the latter would of course (theretically) profit from the
> framework-controlled-cgroup-freeze/thaw (with and without
> opportunistic suspend) while the former should be a problem that
> both opportunistic suspend as well as suspend-from-idle have. Or not?
>
> So, here is the question I'm asking myself: If System A were to be
> complemented by suspend_blockers, wouldn't it still be better?

not neccessarily.

having suspend blockers inside the kernel adds significant complexity, it's
worth it only if the complexity buys you enough. In this case the question
is if the suspend blockers would extend the sleep time enough more to
matter. As per my other e-mail, this is an area with rapidly diminishing
returns as the sleep times get longer.

> With System A you could try to do a really sophisticated
> power-management scheme and so on... but as soon as you allow 3rd-Party
> Apps, how do you manage their cross-dependencies? I.e. you can not
> automatically detect when App1 needs App2 to function.
> You need to allow all 3rd-Party apps to run as a group.
>
> So you can perhaps partition your software stack into "untrusted
> applications" and different groups of software with audited
> dependencies.
>
> If one group interacts with another group (as will be the case at least
> with the "untrusted applications" group) you have to have them both
> running at the same time.
>
> This really gets pretty complex. Do you really think something like
> this is better than a simple suspend? (I.e. suspend blockers or
> having just one group)

even if all you do is have two groups (trusted and untrusted), all you
need to do is to watch for the interaction between these two. Put the
third-party apps in the untrusted group.

depending on what security you have available, you may be able to define
more, smaller groups after using the security to make sure that there is
no overlap between them

> Suppose you implement suspend blockers with a cgroup freeze... how do
> you implement the freeze/thaw control?

I thought the answer had been provided, one of the trusted apps implements
the freeze/thaw, and everything happens in userspace.

> Cheers,
> Flo
>
> p.s.: do you see an possibility for any kind of "priority inheritance"
> in the cgroup scheme? I don't.

is there a need for it?

David Lang

2010-06-09 07:44:04

by Felipe Contreras

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Wed, Jun 9, 2010 at 6:46 AM, Linus Torvalds
<[email protected]> wrote:
> On Tue, 8 Jun 2010, [email protected] wrote:
>>
>> having suspend blockers inside the kernel adds significant complexity, it's
>> worth it only if the complexity buys you enough. In this case the question is
>> if the suspend blockers would extend the sleep time enough more to matter. As
>> per my other e-mail, this is an area with rapidly diminishing returns as the
>> sleep times get longer.
>
> Well, the counter-argument that nobody seems to have brought up is that
> suspend blockers exist, are real code, and end up being shipped in a lot
> of machines.
>
> That's a _big_ argument in favour of them. Certainly much bigger than
> arguing against them based on some complexity-arguments for an alternative
> that hasn't seen any testing at all.
>
> IOW, I would seriously hope that this discussion was more about real code
> that _exists_ and does what people need. It seems to have degenerated into
> something else.
>
> Because in the end, "code talks, bullshit walks". People can complain and
> suggest alternatives all they want, but you can't just argue. At some
> point you need to show the code that actually solves the problem.

That's assuming there is an actual problem, which according to all the
embedded people except android, there is not.

And if there is indeed such a problem (probably not big), it might be
solved properly by the time suspend blockers are merged, or few
releases after.

Whatever the solution (or workaround) is, it would be nice if it could
be used by more than just android people, and it would also be nice to
do it without introducing user-space API that *nobody* likes and might
be quickly deprecated.

--
Felipe Contreras

2010-06-09 09:39:35

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Wednesday 09 June 2010, Felipe Contreras wrote:
> On Wed, Jun 9, 2010 at 6:46 AM, Linus Torvalds
> <[email protected]> wrote:
> > On Tue, 8 Jun 2010, [email protected] wrote:
> >>
> >> having suspend blockers inside the kernel adds significant complexity, it's
> >> worth it only if the complexity buys you enough. In this case the question is
> >> if the suspend blockers would extend the sleep time enough more to matter. As
> >> per my other e-mail, this is an area with rapidly diminishing returns as the
> >> sleep times get longer.
> >
> > Well, the counter-argument that nobody seems to have brought up is that
> > suspend blockers exist, are real code, and end up being shipped in a lot
> > of machines.
> >
> > That's a _big_ argument in favour of them. Certainly much bigger than
> > arguing against them based on some complexity-arguments for an alternative
> > that hasn't seen any testing at all.
> >
> > IOW, I would seriously hope that this discussion was more about real code
> > that _exists_ and does what people need. It seems to have degenerated into
> > something else.
> >
> > Because in the end, "code talks, bullshit walks". People can complain and
> > suggest alternatives all they want, but you can't just argue. At some
> > point you need to show the code that actually solves the problem.
>
> That's assuming there is an actual problem, which according to all the
> embedded people except android, there is not.

Yes, there is, but they've decided to ignore it.

> And if there is indeed such a problem (probably not big), it might be
> solved properly by the time suspend blockers are merged, or few
> releases after.

Not quite. Have you followed all of the discussion, actually?

> Whatever the solution (or workaround) is, it would be nice if it could
> be used by more than just android people, and it would also be nice to
> do it without introducing user-space API that *nobody* likes and might
> be quickly deprecated.

I agree with Linus and I don't have that much of a problem with the API that
people seem to have. In fact the much-hated user space API is just a char
device driver with 3 ioctls (that can be extended in future if need be) and
the kernel API is acceptable to me. Yes, there is some overlap between it
and PM QoS, but IMhO that overlap may be reduced over time (eg. by
using PM QoS requirements to implement suspend blockers).

To me, the question boils down to whether or not we're able to persuade the
Android people to use any other approach (eg. by demonstrating that something
else is actually better), because even if we invent a brilliant new approach,
but Android ends up using its old one anyway, the net result will be as though
we haven't done anything useful.

Rafael

2010-06-09 12:34:47

by Mark Brown

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Sun, Jun 06, 2010 at 12:58:10PM -0700, Brian Swetland wrote:
> On Sun, Jun 6, 2010 at 12:24 PM, Christoph Hellwig <[email protected]> wrote:

> > On the other hand I've heard
> > that various hardware vendors or parties closed to them are rather
> > annoyed by their drivers beeing stuck in the android tree - but that
> > can be easily solved by getting removing the suspend blockers (at least
> > temporarily), cleaning up a few bits here and there and getting them in.

> This continues to baffle me. If we (Google) are such a headache, why
> not just route around us. The drivers we've written are GPLv2, the
> source is out there for anyone who wants it, etc. The drivers other
> people have written we have no control over at all. From my point of
> view it'd be an annoyance if somebody took the code we wrote, modified
> it heavily, and pushed it upstream, but fundamentally I can't stop
> that from happening other than by pushing it upstream myself, first.

AFAICT this is purely down to the fact that the vendors producing
Android devices are using the kernel which is shipped with whatever
release they are using so people doing drivers end up getting locked in
to an older kernel with old APIs (independant of Android specifics) and
don't have the resource to redo things for upstream. Suspend blockers
are one more API update in there, but general kernel development creates
far more.

I was looking at this just today and one thing that it occurs to me
might help is if when you guys rebase your work against upstream you
were to tag the results - at the minute the only "release" Android
kernels are those included in full stack releases so providing more
hints that other kernel versions could be substituted in may help.

2010-06-09 15:29:38

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Tue, 8 Jun 2010, Arve Hj?nnev?g wrote:

> >> Wakeup event occurs, and the driver:
> >> - report wakeup event type A
> >> - queue event for delivery to user-space
> >
> > That's not really two distinct steps. ?Queuing the event for delivery
> > to userspace involves waking up any tasks that are waiting to read the
> > device file; that action (calling wake_up_all() or whatever the driver
> > does) is how the event gets reported.
> >
>
> If you want to ensure that more than one process see the event it has
> to be two steps, but it does not affect the race I was trying to
> describe.

Are you sure about that? If two processes call poll() for the same
file descriptor, don't both calls return when data becomes available?
But agreed, it doesn't matter -- especially since I only need one
process (the power manager) to see the event.

> >> User space wakes up:
> >> - Calls api to block task freezing for event type A
> >
> > Again, that's a confusing way of putting it. ?The API you're referring
> > to is simply the function that activates a suspend blocker. ?It does
> > prevent task freezing, but you shouldn't say it prevents freezing for
> > event type A. ?More like the other way around: In addition to
> > preventing freezing, the function tells the power manager that event
> > type A should no longer be considered active. ?Thus, in a sense it
> > _stops_ event type A from preventing freezing.
> >
> >> Another wakeup event occurs, and the driver:
> >> - report wakeup event type A

I think this is where you misunderstood. There is no "report wakeup
event" as such. All that happens is that data becomes available to be
read from the input device file. However the power manager process
isn't polling the device file at this point (because a suspend blocker
is active), so it doesn't realize that the source has become active
again.

> >> - queue event for delivery to user-space
> >
> > Same as above.
> >
> >> User space continues:
> >> - Read events
>
> Sorry, I missed the unblock task freezing step here.
>
> >> - Wait for more events
> >>
> >> Result: Task are not frozen again.
> >
> > Because the suspend blocker was never deactivated. ?The same thing
> > happens with wakelocks: If a task activates a wakelock and never
> > deactivates it, the system won't go into opportunistic suspend again.
>
> Yes, but with the sequence of events above task will not be frozen
> again even if the wake-lock/suspend-blocker/task-freezing-preventer is
> released.

Yes they will. When the suspend blocker is deactivated, the power
manager process will realize that there are no active suspend blockers
and it will think there are no active sources. Thus it will freeze
processes as usual.

> > Here's how my scheme is meant to work:
> >
> > ? ? ? ?Wakeup event for input device A occurs.
> >
> > ? ? ? ?A's driver adds an entry to the input device queue and
> > ? ? ? ?(if the queue was empty) does wake_up_all() on the device
> > ? ? ? ?file's wait_queue.
> >
> > ? ? ? ?The PM process returns from poll() and sees that device
> > ? ? ? ?file A is now readable, so it adds A to its list of active
> > ? ? ? ?sources and unfreezes userspace.
> >
> > ? ? ? ?Some other process sees that device file A is now readable,
> > ? ? ? ?so it activates a suspend blocker and reads events from A.
> >
> > ? ? ? ?When the PM process receives the request to activate the
> > ? ? ? ?suspend blocker, it removes A from its list of active
> > ? ? ? ?sources. ?But it doesn't freeze userspace yet, because now
> > ? ? ? ?a suspend blocker is active.
>
> If another event happens at this point don't you put A back on the
> list? If so, it never gets removed.

No, you don't put A back on the list. Sources get put on the list only
when the information returned by poll() indicates they have data
available. The power manager doesn't poll while suspend blockers are
active.

> > ? ? ? ?The other process consumes events from A and does other
> > ? ? ? ?stuff. ?Maybe more input data arrives while this is happening
> > ? ? ? ?and the process reads it. ?Eventually the process decides to
> > ? ? ? ?deactivate the suspend blocker, perhaps when no more data
> > ? ? ? ?is available from the device file, perhaps not.
> >
> > ? ? ? ?When the PM process receives the request to deactivate the
> > ? ? ? ?suspend blocker, it sees that now there are no active
> > ? ? ? ?sources and no active suspend blockers. ?Therefore it
> > ? ? ? ?freezes userspace and does a big poll() on all possible
> > ? ? ? ?sources. ?(If there are still events on the input device
> > ? ? ? ?queue, the poll() returns immediately.)
> >
> > ? ? ? ?Rinse and repeat.
> >
> > I don't see any dangerous races there. ?The scheme can be made a little
> > more efficient by having the PM process do another poll() (with 0
> > timeout) just before freezing userspace; if the result indicates that a
> > source is active then the freezing and unfreezing can be skipped.

> > There is no race. ?The driver reports an event has occurred by making
> > the data available to be read from the device file, and the event is
> > processed by reading it from the device file (or at least, that's the
> > first step in processing the event).
> >
>
> If the driver making data available to be read triggers a wakeup event
> in the power manager process

It doesn't. Only return from a poll() causes the power manager process
to think a wakeup event has occurred.

> that has to be cleared by the process
> reading the events, then you have a race. Since the power manager is
> selecting/polling on the same file descriptor, I don't see what you
> gain from linking the wakeup events to suspend blockers.

What I gain is the ability to know when an in-kernel wakelock _could_
have been released, without actually implementing in-kernel wakelocks.

With real in-kernel wakelocks, the wakelock is released when the input
queue becomes empty. There's no way for the power manager process to
know exactly when that happens without modifying the kernel. However
we can use the activation of the corresponding userspace suspend
blocker as a proxy. It's nearly as good, and it gets the job done.

If you prefer, an interface could be added whereby a user process tells
the power manager explicitly that it's going to read data from an
input device, instead of relying on implicit notification through
suspend blocker activation. I don't know whether this would be simpler
or more complex; it depends on the design of your userspace.

> If you break
> this link it think can work, but it does require us to modify all code
> that reads wakeup events from the kernel to register the file
> descriptors they get events from.

Yes. I don't know how your user code is structured; if there is a
fixed correspondence between file descriptors and suspend blockers (the
same wakeup events are always handled by the same suspend blockers)
then this will be a simple change -- the file descriptor can be
registered when the suspend blocker is created.

If the correspondence is more dynamic (different suspend blockers used
for the same wakeup device at different times, or multiple wakeup
devices handled by one suspend blocker) then the required changes will
be more complicated. Not tremendously more.

> It would also require adding
> poll/select support to android alarm driver,

Yes. Is this a platform-specific driver? (I assume so, since you
called it the "android alarm driver".) Then poll/select support can be
added without provoking a lot of objections from legions of kernel
developers.

> and any driver that
> currently uses a wakelock with a timeout would need to notify the user
> space power manager instead.

Hmm. This is symptomatic of a deficiency in the original wakelock
implementation -- those timeouts always were arbitrary.

The power manager would indeed have to know about wakeup devices that
don't need to _keep_ the system awake. Here's one way to cope: During
those times when no suspend blockers are active but the PM process
thinks a wakeup source is active, the PM process could poll every few
seconds to update its list of active sources. At those points it could
remove wakeup sources that have timed out.


Obviously this proposal would complicate your userspace. Not
enormously, since most of the work is confined to the power manager,
but somewhat. That's the price to be paid for leaving the kernel
essentially untouched. Consider the amount of resistance your
wakelock/suspend-blocker patches have already received; you'll have to
decide which approach will work out better in the end.

Alan Stern

2010-06-09 22:04:48

by NeilBrown

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Wed, 9 Jun 2010 11:40:27 +0200
"Rafael J. Wysocki" <[email protected]> wrote:

> On Wednesday 09 June 2010, Felipe Contreras wrote:
> > On Wed, Jun 9, 2010 at 6:46 AM, Linus Torvalds
> > <[email protected]> wrote:
> > > On Tue, 8 Jun 2010, [email protected] wrote:
> > >>
> > >> having suspend blockers inside the kernel adds significant complexity, it's
> > >> worth it only if the complexity buys you enough. In this case the question is
> > >> if the suspend blockers would extend the sleep time enough more to matter. As
> > >> per my other e-mail, this is an area with rapidly diminishing returns as the
> > >> sleep times get longer.
> > >
> > > Well, the counter-argument that nobody seems to have brought up is that
> > > suspend blockers exist, are real code, and end up being shipped in a lot
> > > of machines.
> > >
> > > That's a _big_ argument in favour of them. Certainly much bigger than
> > > arguing against them based on some complexity-arguments for an alternative
> > > that hasn't seen any testing at all.
> > >
> > > IOW, I would seriously hope that this discussion was more about real code
> > > that _exists_ and does what people need. It seems to have degenerated into
> > > something else.
> > >
> > > Because in the end, "code talks, bullshit walks". People can complain and
> > > suggest alternatives all they want, but you can't just argue. At some
> > > point you need to show the code that actually solves the problem.
> >
> > That's assuming there is an actual problem, which according to all the
> > embedded people except android, there is not.
>
> Yes, there is, but they've decided to ignore it.
>
> > And if there is indeed such a problem (probably not big), it might be
> > solved properly by the time suspend blockers are merged, or few
> > releases after.
>
> Not quite. Have you followed all of the discussion, actually?
>
> > Whatever the solution (or workaround) is, it would be nice if it could
> > be used by more than just android people, and it would also be nice to
> > do it without introducing user-space API that *nobody* likes and might
> > be quickly deprecated.
>
> I agree with Linus and I don't have that much of a problem with the API that
> people seem to have. In fact the much-hated user space API is just a char
> device driver with 3 ioctls (that can be extended in future if need be) and
> the kernel API is acceptable to me.

I think there is a little bit more to it than that. It seems there is a new
ioctl for input/event devices to say "Any events queued here should be
treated as wake-up events". There may be similar additions to other devices,
but I know of no details.

I wonder if we can get a complete statement of changes to the user-space
API...

> Yes, there is some overlap between it
> and PM QoS, but IMhO that overlap may be reduced over time (eg. by
> using PM QoS requirements to implement suspend blockers).
>
> To me, the question boils down to whether or not we're able to persuade the
> Android people to use any other approach (eg. by demonstrating that something
> else is actually better), because even if we invent a brilliant new approach,
> but Android ends up using its old one anyway, the net result will be as though
> we haven't done anything useful.

Yes. There is no point unless we can meet somewhere in the middle. I think
that would have to include a full suspend that freezes all processes.
Solutions which reject that - while quite clever - would require too much
change to Android user-space to be acceptable.

NeilBrown

>
> Rafael

2010-06-09 23:42:57

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/9 Alan Stern <[email protected]>:
> On Tue, 8 Jun 2010, Arve Hj?nnev?g wrote:
>
>> >> Wakeup event occurs, and the driver:
>> >> - report wakeup event type A
>> >> - queue event for delivery to user-space
>> >
>> > That's not really two distinct steps. ?Queuing the event for delivery
>> > to userspace involves waking up any tasks that are waiting to read the
>> > device file; that action (calling wake_up_all() or whatever the driver
>> > does) is how the event gets reported.
>> >
>>
>> If you want to ensure that more than one process see the event it has
>> to be two steps, but it does not affect the race I was trying to
>> describe.
>
> Are you sure about that? ?If two processes call poll() for the same
> file descriptor, don't both calls return when data becomes available?

Yes if they are both already in the poll call they both return, but if
one process reads the data while the second process is not in the poll
call the second process will not see anything.

> But agreed, it doesn't matter -- especially since I only need one
> process (the power manager) to see the event.

The power may not see the event, the process that reads the event will
always see it. If the power manager is not in the poll call when the
event happens, the process that reads the event can read the event
before the power manager calls poll.

>
>> >> User space wakes up:
>> >> - Calls api to block task freezing for event type A
>> >
>> > Again, that's a confusing way of putting it. ?The API you're referring
>> > to is simply the function that activates a suspend blocker. ?It does
>> > prevent task freezing, but you shouldn't say it prevents freezing for
>> > event type A. ?More like the other way around: In addition to
>> > preventing freezing, the function tells the power manager that event
>> > type A should no longer be considered active. ?Thus, in a sense it
>> > _stops_ event type A from preventing freezing.
>> >
>> >> Another wakeup event occurs, and the driver:
>> >> - report wakeup event type A
>
> I think this is where you misunderstood. ?There is no "report wakeup
> event" as such. ?All that happens is that data becomes available to be
> read from the input device file. ?However the power manager process
> isn't polling the device file at this point (because a suspend blocker
> is active), so it doesn't realize that the source has become active
> again.
>

Yes this is not what I though you were suggesting. I thought you were
trying to make sure the power manager sees all wakeup events. If you
are only listening for wakeup events while no suspend blockers are
active, why latch them?

>> >> - queue event for delivery to user-space
>> >
>> > Same as above.
>> >
>> >> User space continues:
>> >> - Read events
>>
>> Sorry, I missed the unblock task freezing step here.
>>
>> >> - Wait for more events
>> >>
>> >> Result: Task are not frozen again.
>> >
>> > Because the suspend blocker was never deactivated. ?The same thing
>> > happens with wakelocks: If a task activates a wakelock and never
>> > deactivates it, the system won't go into opportunistic suspend again.
>>
>> Yes, but with the sequence of events above task will not be frozen
>> again even if the wake-lock/suspend-blocker/task-freezing-preventer is
>> released.
>
> Yes they will. ?When the suspend blocker is deactivated, the power
> manager process will realize that there are no active suspend blockers
> and it will think there are no active sources. ?Thus it will freeze
> processes as usual.
>
>> > Here's how my scheme is meant to work:
>> >
>> > ? ? ? ?Wakeup event for input device A occurs.
>> >
>> > ? ? ? ?A's driver adds an entry to the input device queue and
>> > ? ? ? ?(if the queue was empty) does wake_up_all() on the device
>> > ? ? ? ?file's wait_queue.
>> >
>> > ? ? ? ?The PM process returns from poll() and sees that device
>> > ? ? ? ?file A is now readable, so it adds A to its list of active
>> > ? ? ? ?sources and unfreezes userspace.
>> >
>> > ? ? ? ?Some other process sees that device file A is now readable,
>> > ? ? ? ?so it activates a suspend blocker and reads events from A.
>> >
>> > ? ? ? ?When the PM process receives the request to activate the
>> > ? ? ? ?suspend blocker, it removes A from its list of active
>> > ? ? ? ?sources. ?But it doesn't freeze userspace yet, because now
>> > ? ? ? ?a suspend blocker is active.
>>
>> If another event happens at this point don't you put A back on the
>> list? If so, it never gets removed.
>
> No, you don't put A back on the list. ?Sources get put on the list only
> when the information returned by poll() indicates they have data
> available. ?The power manager doesn't poll while suspend blockers are
> active.
>
>> > ? ? ? ?The other process consumes events from A and does other
>> > ? ? ? ?stuff. ?Maybe more input data arrives while this is happening
>> > ? ? ? ?and the process reads it. ?Eventually the process decides to
>> > ? ? ? ?deactivate the suspend blocker, perhaps when no more data
>> > ? ? ? ?is available from the device file, perhaps not.
>> >
>> > ? ? ? ?When the PM process receives the request to deactivate the
>> > ? ? ? ?suspend blocker, it sees that now there are no active
>> > ? ? ? ?sources and no active suspend blockers. ?Therefore it
>> > ? ? ? ?freezes userspace and does a big poll() on all possible
>> > ? ? ? ?sources. ?(If there are still events on the input device
>> > ? ? ? ?queue, the poll() returns immediately.)
>> >
>> > ? ? ? ?Rinse and repeat.
>> >
>> > I don't see any dangerous races there. ?The scheme can be made a little
>> > more efficient by having the PM process do another poll() (with 0
>> > timeout) just before freezing userspace; if the result indicates that a
>> > source is active then the freezing and unfreezing can be skipped.
>
>> > There is no race. ?The driver reports an event has occurred by making
>> > the data available to be read from the device file, and the event is
>> > processed by reading it from the device file (or at least, that's the
>> > first step in processing the event).
>> >
>>
>> If the driver making data available to be read triggers a wakeup event
>> in the power manager process
>
> It doesn't. ?Only return from a poll() causes the power manager process
> to think a wakeup event has occurred.
>
>> that has to be cleared by the process
>> reading the events, then you have a race. Since the power manager is
>> selecting/polling on the same file descriptor, I don't see what you
>> gain from linking the wakeup events to suspend blockers.
>
> What I gain is the ability to know when an in-kernel wakelock _could_
> have been released, without actually implementing in-kernel wakelocks.
>
> With real in-kernel wakelocks, the wakelock is released when the input
> queue becomes empty. ?There's no way for the power manager process to
> know exactly when that happens without modifying the kernel. ?However
> we can use the activation of the corresponding userspace suspend
> blocker as a proxy. ?It's nearly as good, and it gets the job done.
>

If you only poll the fd after the last user-space suspend blocker is
released, why do you care when the kernel wakelock could have been
released? It seem the only thing it saves you is an extra poll call
when two wakeup events happen at the same time and one of them is
fully processed and unblocks suspend before the other event handler
blocks suspend. It seems strange to remove your wakeup event from the
list when a specific suspend blocker is acquired when any suspend
blocker will prevent that wakeup event from being added to the list in
the first place.

> If you prefer, an interface could be added whereby a user process tells
> the power manager explicitly that it's going to read data from an
> input device, instead of relying on implicit notification through
> suspend blocker activation. ?I don't know whether this would be simpler
> or more complex; it depends on the design of your userspace.
>

I don't think there is a need to tie the fds to anything else. If you
poll the fds on the last suspend unblock call, you should get the same
behaviour.

>> If you break
>> this link it think can work, but it does require us to modify all code
>> that reads wakeup events from the kernel to register the file
>> descriptors they get events from.
>
> Yes. ?I don't know how your user code is structured; if there is a
> fixed correspondence between file descriptors and suspend blockers (the
> same wakeup events are always handled by the same suspend blockers)
> then this will be a simple change -- the file descriptor can be
> registered when the suspend blocker is created.
>
> If the correspondence is more dynamic (different suspend blockers used
> for the same wakeup device at different times, or multiple wakeup
> devices handled by one suspend blocker) then the required changes will
> be more complicated. ?Not tremendously more.

All input events that can wake the system are handled by one
user-space suspend blocker. Input devices come and go so we would need
to add and remove the fds dynamically.

>
>> It would also require adding
>> poll/select support to android alarm driver,
>
> Yes. ?Is this a platform-specific driver? ?(I assume so, since you
> called it the "android alarm driver".) ?Then poll/select support can be
> added without provoking a lot of objections from legions of kernel
> developers.
>

It is not platform specific, but it is not currently in the mainline
kernel so I would not expect any objections to a change that adding
poll/select support.

>> and any driver that
>> currently uses a wakelock with a timeout would need to notify the user
>> space power manager instead.
>
> Hmm. ?This is symptomatic of a deficiency in the original wakelock
> implementation -- those timeouts always were arbitrary.
>
> The power manager would indeed have to know about wakeup devices that
> don't need to _keep_ the system awake. ?Here's one way to cope: During
> those times when no suspend blockers are active but the PM process
> thinks a wakeup source is active, the PM process could poll every few
> seconds to update its list of active sources. ?At those points it could
> remove wakeup sources that have timed out.
>

For that to work the wakeup events would have to be reported to the
power manager in a reliable way in the first place. Passing the file
descriptor that the app uses to the power manager does not work for
this, since the app could read the event while the power manager was
not in the poll call and the power manager would never see it. Also,
existing apps don't pass their file descriptors to the power manager,
so it has the get the event from somewhere else.

>
> Obviously this proposal would complicate your userspace. ?Not
> enormously, since most of the work is confined to the power manager,

No, the main problem it that it is not confined to the power manager.
The power manager does not have a list of file descriptors to monitor,
so we have to modify all code that handles wakeup events. This
includes vendor supplied code that we don't have the source for, but,
on some platforms at least, this code relies on kernel wakelocks with
a timeout and could at first be handled the same way we would have to
handle existing apps reading from a socket.

> but somewhat. ?That's the price to be paid for leaving the kernel
> essentially untouched. ?Consider the amount of resistance your
> wakelock/suspend-blocker patches have already received; you'll have to
> decide which approach will work out better in the end.
>

The suspend blocker approach is more generally useful since it
supports hardware where suspend is needed. Why this argument is being
ignored is very puzzling.

Your solution is not immediately useful since depends on removing all
periodic kernel timers and adding support to stopping the monotonic
clock for set of processes that are frozen.

Your solution forces the user space interface to be more complicated,
and it creates a new user space power manager that has to start before
any process that handle wakeup events.

--
Arve Hj?nnev?g

2010-06-10 04:22:07

by David Lang

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Wed, 9 Jun 2010, Arve Hj?nnev?g wrote:

>
> The power may not see the event, the process that reads the event will
> always see it. If the power manager is not in the poll call when the
> event happens, the process that reads the event can read the event
> before the power manager calls poll.
>

>
> All input events that can wake the system are handled by one
> user-space suspend blocker. Input devices come and go so we would need
> to add and remove the fds dynamically.


>
> For that to work the wakeup events would have to be reported to the
> power manager in a reliable way in the first place. Passing the file
> descriptor that the app uses to the power manager does not work for
> this, since the app could read the event while the power manager was
> not in the poll call and the power manager would never see it. Also,
> existing apps don't pass their file descriptors to the power manager,
> so it has the get the event from somewhere else.
>

why could the suspend blocker process see all events, but the power
manager process not see the events?

have the userspace talk to the power manager the way it does to the
suspend blocker now and what's the difference?

effectivly think s/suspend blocker/power manager/ (with the power manager
doing all the other things that are proposed instead of grabbing the
wakelock), the difference should be hidden to the rest of userspace.

what am I missing here?

David Lang

2010-06-10 04:51:41

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/9 <[email protected]>:
> On Wed, 9 Jun 2010, Arve Hj?nnev?g wrote:
>
>>
>> The power may not see the event, the process that reads the event will
>> always see it. If the power manager is not in the poll call when the
>> event happens, the process that reads the event can read the event
>> before the power manager calls poll.
>>
>
>>
>> All input events that can wake the system are handled by one
>> user-space suspend blocker. Input devices come and go so we would need
>> to add and remove the fds dynamically.
>
>
>>
>> For that to work the wakeup events would have to be reported to the
>> power manager in a reliable way in the first place. Passing the file
>> descriptor that the app uses to the power manager does not work for
>> this, since the app could read the event while the power manager was
>> not in the poll call and the power manager would never see it. Also,
>> existing apps don't pass their file descriptors to the power manager,
>> so it has the get the event from somewhere else.
>>
>
> why could the suspend blocker process see all events, but the power manager
> process not see the events?
>

Because in this proposal the power manager only looks for the events
(on the same queue) when no user space suspend blockers are active.

> have the userspace talk to the power manager the way it does to the suspend
> blocker now and what's the difference?
>
> effectivly think s/suspend blocker/power manager/ (with the power manager
> doing all the other things that are proposed instead of grabbing the
> wakelock), the difference should be hidden to the rest of userspace.
>
> what am I missing here?
>

The current user space interface does not require that clients
register the file descriptors that they get wakeup events from with
another process.

--
Arve Hj?nnev?g

2010-06-10 05:11:14

by NeilBrown

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Wed, 9 Jun 2010 21:51:38 -0700
Arve Hjønnevåg <[email protected]> wrote:

> The current user space interface does not require that clients
> register the file descriptors that they get wakeup events from with
> another process.
>

However I believe they *do* register these file descriptors with the kernel,
via some sort of ioctl (I think you have said that is the case for input
devices at least).
Can you confirm that?

If that is the case, is it really such a big change to register with another
process instead of with the kernel?

Thanks,
NeilBrown

2010-06-10 09:01:40

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thursday, June 10, 2010, Neil Brown wrote:
> On Wed, 9 Jun 2010 11:40:27 +0200
> "Rafael J. Wysocki" <[email protected]> wrote:
>
> > On Wednesday 09 June 2010, Felipe Contreras wrote:
> > > On Wed, Jun 9, 2010 at 6:46 AM, Linus Torvalds
> > > <[email protected]> wrote:
> > > > On Tue, 8 Jun 2010, [email protected] wrote:
> > > >>
> > > >> having suspend blockers inside the kernel adds significant complexity, it's
> > > >> worth it only if the complexity buys you enough. In this case the question is
> > > >> if the suspend blockers would extend the sleep time enough more to matter. As
> > > >> per my other e-mail, this is an area with rapidly diminishing returns as the
> > > >> sleep times get longer.
> > > >
> > > > Well, the counter-argument that nobody seems to have brought up is that
> > > > suspend blockers exist, are real code, and end up being shipped in a lot
> > > > of machines.
> > > >
> > > > That's a _big_ argument in favour of them. Certainly much bigger than
> > > > arguing against them based on some complexity-arguments for an alternative
> > > > that hasn't seen any testing at all.
> > > >
> > > > IOW, I would seriously hope that this discussion was more about real code
> > > > that _exists_ and does what people need. It seems to have degenerated into
> > > > something else.
> > > >
> > > > Because in the end, "code talks, bullshit walks". People can complain and
> > > > suggest alternatives all they want, but you can't just argue. At some
> > > > point you need to show the code that actually solves the problem.
> > >
> > > That's assuming there is an actual problem, which according to all the
> > > embedded people except android, there is not.
> >
> > Yes, there is, but they've decided to ignore it.
> >
> > > And if there is indeed such a problem (probably not big), it might be
> > > solved properly by the time suspend blockers are merged, or few
> > > releases after.
> >
> > Not quite. Have you followed all of the discussion, actually?
> >
> > > Whatever the solution (or workaround) is, it would be nice if it could
> > > be used by more than just android people, and it would also be nice to
> > > do it without introducing user-space API that *nobody* likes and might
> > > be quickly deprecated.
> >
> > I agree with Linus and I don't have that much of a problem with the API that
> > people seem to have. In fact the much-hated user space API is just a char
> > device driver with 3 ioctls (that can be extended in future if need be) and
> > the kernel API is acceptable to me.
>
> I think there is a little bit more to it than that. It seems there is a new
> ioctl for input/event devices to say "Any events queued here should be
> treated as wake-up events". There may be similar additions to other devices,
> but I know of no details.

That's not in the patchset as in my pull request.

It is used on Android, though, and it would have been submitted separately, had
the first patchset been merged.

> I wonder if we can get a complete statement of changes to the user-space
> API...
>
> > Yes, there is some overlap between it
> > and PM QoS, but IMhO that overlap may be reduced over time (eg. by
> > using PM QoS requirements to implement suspend blockers).
> >
> > To me, the question boils down to whether or not we're able to persuade the
> > Android people to use any other approach (eg. by demonstrating that something
> > else is actually better), because even if we invent a brilliant new approach,
> > but Android ends up using its old one anyway, the net result will be as though
> > we haven't done anything useful.
>
> Yes. There is no point unless we can meet somewhere in the middle. I think
> that would have to include a full suspend that freezes all processes.
> Solutions which reject that - while quite clever - would require too much
> change to Android user-space to be acceptable.

Moreover, having thought a bit more about the "power manager in user space"
concept I'm not sure if it really is that better than the original wakelocks
idea. Namely, it only repaces a kernel-based mechanism with a user space
task doing basically the same thing, but the communication between that task
and the other cooperating user space tasks is arguably more complicated (it
also uses the kernel resources, although indirectly).

So, for a phone-like system, where you'd generally want to simplify user space,
having a "power manager" in the kernel seems to make sense to me.

Rafael

2010-06-10 12:01:00

by NeilBrown

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thu, 10 Jun 2010 10:59:43 +0200
"Rafael J. Wysocki" <[email protected]> wrote:

> On Thursday, June 10, 2010, Neil Brown wrote:
> > On Wed, 9 Jun 2010 11:40:27 +0200
> > "Rafael J. Wysocki" <[email protected]> wrote:
> >
> > > On Wednesday 09 June 2010, Felipe Contreras wrote:
> > > > On Wed, Jun 9, 2010 at 6:46 AM, Linus Torvalds
> > > > <[email protected]> wrote:
> > > > > On Tue, 8 Jun 2010, [email protected] wrote:
> > > > >>
> > > > >> having suspend blockers inside the kernel adds significant complexity, it's
> > > > >> worth it only if the complexity buys you enough. In this case the question is
> > > > >> if the suspend blockers would extend the sleep time enough more to matter. As
> > > > >> per my other e-mail, this is an area with rapidly diminishing returns as the
> > > > >> sleep times get longer.
> > > > >
> > > > > Well, the counter-argument that nobody seems to have brought up is that
> > > > > suspend blockers exist, are real code, and end up being shipped in a lot
> > > > > of machines.
> > > > >
> > > > > That's a _big_ argument in favour of them. Certainly much bigger than
> > > > > arguing against them based on some complexity-arguments for an alternative
> > > > > that hasn't seen any testing at all.
> > > > >
> > > > > IOW, I would seriously hope that this discussion was more about real code
> > > > > that _exists_ and does what people need. It seems to have degenerated into
> > > > > something else.
> > > > >
> > > > > Because in the end, "code talks, bullshit walks". People can complain and
> > > > > suggest alternatives all they want, but you can't just argue. At some
> > > > > point you need to show the code that actually solves the problem.
> > > >
> > > > That's assuming there is an actual problem, which according to all the
> > > > embedded people except android, there is not.
> > >
> > > Yes, there is, but they've decided to ignore it.
> > >
> > > > And if there is indeed such a problem (probably not big), it might be
> > > > solved properly by the time suspend blockers are merged, or few
> > > > releases after.
> > >
> > > Not quite. Have you followed all of the discussion, actually?
> > >
> > > > Whatever the solution (or workaround) is, it would be nice if it could
> > > > be used by more than just android people, and it would also be nice to
> > > > do it without introducing user-space API that *nobody* likes and might
> > > > be quickly deprecated.
> > >
> > > I agree with Linus and I don't have that much of a problem with the API that
> > > people seem to have. In fact the much-hated user space API is just a char
> > > device driver with 3 ioctls (that can be extended in future if need be) and
> > > the kernel API is acceptable to me.
> >
> > I think there is a little bit more to it than that. It seems there is a new
> > ioctl for input/event devices to say "Any events queued here should be
> > treated as wake-up events". There may be similar additions to other devices,
> > but I know of no details.
>
> That's not in the patchset as in my pull request.
>
> It is used on Android, though, and it would have been submitted separately, had
> the first patchset been merged.

Very true. But as the one cannot be used without the other, they really need
to be considered as a package.


>
> > I wonder if we can get a complete statement of changes to the user-space
> > API...
> >
> > > Yes, there is some overlap between it
> > > and PM QoS, but IMhO that overlap may be reduced over time (eg. by
> > > using PM QoS requirements to implement suspend blockers).
> > >
> > > To me, the question boils down to whether or not we're able to persuade the
> > > Android people to use any other approach (eg. by demonstrating that something
> > > else is actually better), because even if we invent a brilliant new approach,
> > > but Android ends up using its old one anyway, the net result will be as though
> > > we haven't done anything useful.
> >
> > Yes. There is no point unless we can meet somewhere in the middle. I think
> > that would have to include a full suspend that freezes all processes.
> > Solutions which reject that - while quite clever - would require too much
> > change to Android user-space to be acceptable.
>
> Moreover, having thought a bit more about the "power manager in user space"
> concept I'm not sure if it really is that better than the original wakelocks
> idea. Namely, it only repaces a kernel-based mechanism with a user space
> task doing basically the same thing, but the communication between that task
> and the other cooperating user space tasks is arguably more complicated (it
> also uses the kernel resources, although indirectly).
>
> So, for a phone-like system, where you'd generally want to simplify user space,
> having a "power manager" in the kernel seems to make sense to me.
>

Following that logic would we end up putting everything in the kernel?

To my mind the advantage of having something in user-space is flexibility -
you can refine the interfaces and behaviours without bothering the kernel.
The reasons for putting things in the kernel are:
- tight integration with VM or processes (the two main abstractions that the
kernel has to manage)
- privileged access to devices
- arbitration between processes with different privilege levels

Need-for-speed, on the other hand, is not necessarily a justification for
going in the kernel - experience shows that we can make user-space
interactions quite fast enough.


The "power manager" itself is quite trivial. It just needs something like:
while true
wait for all wake-locks to be dropped
activate suspend
and for 'activate suspend' to be interruptible by something taking out a
wake-lock.

To my mind, the most interesting part of this is interruptibility. There is
essentially only one way to do that in Unix/Linux: signals.
So it makes sense to have a task - with a pid - performing this loop so that
it can be sent a signal.
And we already have mechanisms for sending signals on all sorts of different
events, both kernel events (fcntl(F_OWNER)), file events (FNOTIFY), and
normal process operations (kill).

If you are going to have a process (which I think you should) it makes sense
for it to spend some time in user-space to use ipc-of-your-choice to
communicate with other processes, and spend some time in kernel-space to be
able to wait for in-kernel locks to be dropped, and to use signals to get out
of kernel-space, and system calls to get into kernel space.

Yes, the proposed user-space interface for suspend blocks may not be all that
horrible. But the fact is that you can get all the required functionality by
just using currently existing interfaces. Against that background, adding
anything new should be avoided.

NeilBrown

2010-06-10 13:58:51

by Mark Brown

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thu, Jun 10, 2010 at 10:59:43AM +0200, Rafael J. Wysocki wrote:

> So, for a phone-like system, where you'd generally want to simplify user space,
> having a "power manager" in the kernel seems to make sense to me.

I'm not clear where this requirement to simplify user space specifically
for phones comes from - phones do have pretty substantial software
stacks that aren't that far away to PCs.

2010-06-10 13:59:03

by Pavel Machek

[permalink] [raw]
Subject: Re: suspend blockers & Android integration

Hi!

> >> We started here because it's possibly the only api level change we have --
> >> almost everything else is driver or subarch type work or controversial but
> >> entirely self-contained (like the binder, which I would be shocked to see
> >> ever hit mainline). [...]
> >
> > So why arent those bits mainline? It's a 1000 times easier to get drivers and
> > small improvements and non-ABI changes upstream.
> >
> > After basically two years of growing your fork (and some attempts to get your
> > drivers into drivers/staging/ - from where they have meanwhile dropped out
> > again) you re-started with the worst possible thing to merge: a big and
> > difficult kernel feature affecting many subsystems. Why?
>
> Because a large number of our drivers depend on it.

The dependencies are trivial. Last time I checked, you had about
150KLoC drivers, with about 100 lines depending on wakelock support --
I know, I cleaned it up from staging.

The changes required for merging 150KLoC will be definitely much
bigger than 100 lines...

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2010-06-10 14:28:50

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Wed, 9 Jun 2010, Arve Hj?nnev?g wrote:

> > I think this is where you misunderstood. ?There is no "report wakeup
> > event" as such. ?All that happens is that data becomes available to be
> > read from the input device file. ?However the power manager process
> > isn't polling the device file at this point (because a suspend blocker
> > is active), so it doesn't realize that the source has become active
> > again.
> >
>
> Yes this is not what I though you were suggesting. I thought you were
> trying to make sure the power manager sees all wakeup events. If you
> are only listening for wakeup events while no suspend blockers are
> active, why latch them?

...

> If you only poll the fd after the last user-space suspend blocker is
> released, why do you care when the kernel wakelock could have been
> released? It seem the only thing it saves you is an extra poll call
> when two wakeup events happen at the same time and one of them is
> fully processed and unblocks suspend before the other event handler
> blocks suspend. It seems strange to remove your wakeup event from the
> list when a specific suspend blocker is acquired when any suspend
> blocker will prevent that wakeup event from being added to the list in
> the first place.

...

> I don't think there is a need to tie the fds to anything else. If you
> poll the fds on the last suspend unblock call, you should get the same
> behaviour.

You are quite right; there is no need to associate suspend blockers
with wakeup sources. The power manager merely needs to poll all wakeup
sources whenever no suspend blockers are active. I put those
associations in the proposal because of the line of reasoning that led
up to it, but they aren't necessary.


> All input events that can wake the system are handled by one
> user-space suspend blocker. Input devices come and go so we would need
> to add and remove the fds dynamically.

Correct; the power manager would need to know whenever a wakeup-capable
device file was opened or closed.


> > The power manager would indeed have to know about wakeup devices that
> > don't need to _keep_ the system awake. ?Here's one way to cope: During
> > those times when no suspend blockers are active but the PM process
> > thinks a wakeup source is active, the PM process could poll every few
> > seconds to update its list of active sources. ?At those points it could
> > remove wakeup sources that have timed out.
> >
>
> For that to work the wakeup events would have to be reported to the
> power manager in a reliable way in the first place. Passing the file
> descriptor that the app uses to the power manager does not work for
> this, since the app could read the event while the power manager was
> not in the poll call and the power manager would never see it.

If the app activates a suspend blocker before reading the event, this
doesn't matter. If the app doesn't activate a suspend blocker then it
risks being suspended after it has read the event but before it has
handled the event. This is equally true with wakelocks.

> Also,
> existing apps don't pass their file descriptors to the power manager,
> so it has the get the event from somewhere else.

Now you've put your finger on the key. The main difference between my
scheme and the original wakelock scheme is that programs have to inform
the power manager whenever they open or close a wakeup-capable device
file. With everything implemented inside the kernel this isn't
necessary, because obviously a kernel driver already knows when its
device file is opened or closed.

As I said before, this additional complication in userspace is the
price paid for keeping stuff out of the kernel. If you had implemented
wakelocks this way originally, you would not have needed to patch the
vanilla kernel and this entire stormy discussion would never have
occurred. (But of course you couldn't have used this for the original
implementation of wakelocks, because back then the hardware couldn't
achieve the lowest power states from idle.)

> > Obviously this proposal would complicate your userspace. ?Not
> > enormously, since most of the work is confined to the power manager,
>
> No, the main problem it that it is not confined to the power manager.
> The power manager does not have a list of file descriptors to monitor,
> so we have to modify all code that handles wakeup events. This
> includes vendor supplied code that we don't have the source for, but,
> on some platforms at least, this code relies on kernel wakelocks with
> a timeout and could at first be handled the same way we would have to
> handle existing apps reading from a socket.

Look, I never said this scheme was _better_ than wakelocks. I merely
said that it could be used without significant modifications to the
kernel.

> The suspend blocker approach is more generally useful since it
> supports hardware where suspend is needed. Why this argument is being
> ignored is very puzzling.

Probably because people doesn't envision system suspend being used for
dynamic power management on that kind of hardware.

> Your solution is not immediately useful since depends on removing all
> periodic kernel timers and adding support to stopping the monotonic
> clock for set of processes that are frozen.

The monotonic clock is indeed an issue.

Periodic kernel timers... I don't know about them. Has anyone tried to
measure them? There are excellent reasons for trimming them when the
system goes into low-power idle, independent of Android. This sounds
like the kind of thing Arjan might have worked on.

> Your solution forces the user space interface to be more complicated,
> and it creates a new user space power manager that has to start before
> any process that handle wakeup events.

That's right.

Alan Stern

2010-06-10 14:39:55

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Wed, 9 Jun 2010 [email protected] wrote:

> why could the suspend blocker process see all events, but the power
> manager process not see the events?
>
> have the userspace talk to the power manager the way it does to the
> suspend blocker now and what's the difference?
>
> effectivly think s/suspend blocker/power manager/ (with the power manager
> doing all the other things that are proposed instead of grabbing the
> wakelock), the difference should be hidden to the rest of userspace.
>
> what am I missing here?

The main difference is that with a userspace power manager, programs
have to tell the power manager whenever they open or close a
wakeup-capable device file (and send it a copy of the file descriptor).
With an in-kernel implementation these extra steps aren't needed,
because of course kernel drivers already know when their device files
are opened or closed.

Alan Stern

2010-06-10 14:47:33

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thu, 10 Jun 2010, Rafael J. Wysocki wrote:

> Moreover, having thought a bit more about the "power manager in user space"
> concept I'm not sure if it really is that better than the original wakelocks
> idea. Namely, it only repaces a kernel-based mechanism with a user space
> task doing basically the same thing, but the communication between that task
> and the other cooperating user space tasks is arguably more complicated (it
> also uses the kernel resources, although indirectly).

That is all true. The "power manager in userspace" was meant to prove
a point: that this _could_ be done without invasive changes to the
kernel. It wasn't necessarily meant to be a _better_ solution.

> So, for a phone-like system, where you'd generally want to simplify user space,
> having a "power manager" in the kernel seems to make sense to me.

This is a judgment call. Obviously different people have different
opinions.

Alan Stern

2010-06-10 15:46:30

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thursday, June 10, 2010, Alan Stern wrote:
> On Thu, 10 Jun 2010, Rafael J. Wysocki wrote:
>
> > Moreover, having thought a bit more about the "power manager in user space"
> > concept I'm not sure if it really is that better than the original wakelocks
> > idea. Namely, it only repaces a kernel-based mechanism with a user space
> > task doing basically the same thing, but the communication between that task
> > and the other cooperating user space tasks is arguably more complicated (it
> > also uses the kernel resources, although indirectly).
>
> That is all true. The "power manager in userspace" was meant to prove
> a point: that this _could_ be done without invasive changes to the
> kernel. It wasn't necessarily meant to be a _better_ solution.
>
> > So, for a phone-like system, where you'd generally want to simplify user space,
> > having a "power manager" in the kernel seems to make sense to me.
>
> This is a judgment call. Obviously different people have different
> opinions.

Agreed.

Rafael

2010-06-10 15:48:21

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thursday, June 10, 2010, Mark Brown wrote:
> On Thu, Jun 10, 2010 at 10:59:43AM +0200, Rafael J. Wysocki wrote:
>
> > So, for a phone-like system, where you'd generally want to simplify user space,
> > having a "power manager" in the kernel seems to make sense to me.
>
> I'm not clear where this requirement to simplify user space specifically
> for phones comes from

This isn't a requirement, but something that IMO is reasonable.

> - phones do have pretty substantial software stacks that aren't that far away
> to PCs.

That doesn't seem to be relevant here.

Rafael

2010-06-10 16:08:28

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thursday, June 10, 2010, Neil Brown wrote:
> On Thu, 10 Jun 2010 10:59:43 +0200
> "Rafael J. Wysocki" <[email protected]> wrote:
>
> > On Thursday, June 10, 2010, Neil Brown wrote:
> > > On Wed, 9 Jun 2010 11:40:27 +0200
> > > "Rafael J. Wysocki" <[email protected]> wrote:
> > >
> > > > On Wednesday 09 June 2010, Felipe Contreras wrote:
> > > > > On Wed, Jun 9, 2010 at 6:46 AM, Linus Torvalds
> > > > > <[email protected]> wrote:
> > > > > > On Tue, 8 Jun 2010, [email protected] wrote:
> > > > > >>
> > > > > >> having suspend blockers inside the kernel adds significant complexity, it's
> > > > > >> worth it only if the complexity buys you enough. In this case the question is
> > > > > >> if the suspend blockers would extend the sleep time enough more to matter. As
> > > > > >> per my other e-mail, this is an area with rapidly diminishing returns as the
> > > > > >> sleep times get longer.
> > > > > >
> > > > > > Well, the counter-argument that nobody seems to have brought up is that
> > > > > > suspend blockers exist, are real code, and end up being shipped in a lot
> > > > > > of machines.
> > > > > >
> > > > > > That's a _big_ argument in favour of them. Certainly much bigger than
> > > > > > arguing against them based on some complexity-arguments for an alternative
> > > > > > that hasn't seen any testing at all.
> > > > > >
> > > > > > IOW, I would seriously hope that this discussion was more about real code
> > > > > > that _exists_ and does what people need. It seems to have degenerated into
> > > > > > something else.
> > > > > >
> > > > > > Because in the end, "code talks, bullshit walks". People can complain and
> > > > > > suggest alternatives all they want, but you can't just argue. At some
> > > > > > point you need to show the code that actually solves the problem.
> > > > >
> > > > > That's assuming there is an actual problem, which according to all the
> > > > > embedded people except android, there is not.
> > > >
> > > > Yes, there is, but they've decided to ignore it.
> > > >
> > > > > And if there is indeed such a problem (probably not big), it might be
> > > > > solved properly by the time suspend blockers are merged, or few
> > > > > releases after.
> > > >
> > > > Not quite. Have you followed all of the discussion, actually?
> > > >
> > > > > Whatever the solution (or workaround) is, it would be nice if it could
> > > > > be used by more than just android people, and it would also be nice to
> > > > > do it without introducing user-space API that *nobody* likes and might
> > > > > be quickly deprecated.
> > > >
> > > > I agree with Linus and I don't have that much of a problem with the API that
> > > > people seem to have. In fact the much-hated user space API is just a char
> > > > device driver with 3 ioctls (that can be extended in future if need be) and
> > > > the kernel API is acceptable to me.
> > >
> > > I think there is a little bit more to it than that. It seems there is a new
> > > ioctl for input/event devices to say "Any events queued here should be
> > > treated as wake-up events". There may be similar additions to other devices,
> > > but I know of no details.
> >
> > That's not in the patchset as in my pull request.
> >
> > It is used on Android, though, and it would have been submitted separately, had
> > the first patchset been merged.
>
> Very true. But as the one cannot be used without the other, they really need
> to be considered as a package.

You can use suspend blockers as in the pull request without the input patch
in principle.

> >
> > > I wonder if we can get a complete statement of changes to the user-space
> > > API...
> > >
> > > > Yes, there is some overlap between it
> > > > and PM QoS, but IMhO that overlap may be reduced over time (eg. by
> > > > using PM QoS requirements to implement suspend blockers).
> > > >
> > > > To me, the question boils down to whether or not we're able to persuade the
> > > > Android people to use any other approach (eg. by demonstrating that something
> > > > else is actually better), because even if we invent a brilliant new approach,
> > > > but Android ends up using its old one anyway, the net result will be as though
> > > > we haven't done anything useful.
> > >
> > > Yes. There is no point unless we can meet somewhere in the middle. I think
> > > that would have to include a full suspend that freezes all processes.
> > > Solutions which reject that - while quite clever - would require too much
> > > change to Android user-space to be acceptable.
> >
> > Moreover, having thought a bit more about the "power manager in user space"
> > concept I'm not sure if it really is that better than the original wakelocks
> > idea. Namely, it only repaces a kernel-based mechanism with a user space
> > task doing basically the same thing, but the communication between that task
> > and the other cooperating user space tasks is arguably more complicated (it
> > also uses the kernel resources, although indirectly).
> >
> > So, for a phone-like system, where you'd generally want to simplify user space,
> > having a "power manager" in the kernel seems to make sense to me.
> >
>
> Following that logic would we end up putting everything in the kernel?

No, I don't think so.

> To my mind the advantage of having something in user-space is flexibility -
> you can refine the interfaces and behaviours without bothering the kernel.
> The reasons for putting things in the kernel are:
> - tight integration with VM or processes (the two main abstractions that the
> kernel has to manage)
> - privileged access to devices
> - arbitration between processes with different privilege levels

I'd add inter-process communication to this list as well.

> Need-for-speed, on the other hand, is not necessarily a justification for
> going in the kernel - experience shows that we can make user-space
> interactions quite fast enough.

Still, if a kernel-based approach ends up being simpler than a user-space-only
alternative, I tend to prefer the former. And please remember that in this
particular case all user-space-only alternatives in fact _have_ _to_ use
facilities provided by the kernel anyway, although they can use the existing
ones.

> The "power manager" itself is quite trivial. It just needs something like:
> while true
> wait for all wake-locks to be dropped
> activate suspend
> and for 'activate suspend' to be interruptible by something taking out a
> wake-lock.

Where the last thing is kind of complicated.

> To my mind, the most interesting part of this is interruptibility. There is
> essentially only one way to do that in Unix/Linux: signals.

Well, I don't follow. :-)

> So it makes sense to have a task - with a pid - performing this loop so that
> it can be sent a signal.
> And we already have mechanisms for sending signals on all sorts of different
> events, both kernel events (fcntl(F_OWNER)), file events (FNOTIFY), and
> normal process operations (kill).
>
> If you are going to have a process (which I think you should)

There's no such process in the original patchset and it seemed to get away with
that.

> it makes sense for it to spend some time in user-space to use ipc-of-your-choice to
> communicate with other processes, and spend some time in kernel-space to be
> able to wait for in-kernel locks to be dropped, and to use signals to get out
> of kernel-space, and system calls to get into kernel space.
>
> Yes, the proposed user-space interface for suspend blocks may not be all that
> horrible. But the fact is that you can get all the required functionality by
> just using currently existing interfaces. Against that background, adding
> anything new should be avoided.

Again, if things can be simplified overall by adding something new to the
kernel, I don't see a problem with that in principle. Especially if that new
stuff doesn't really interfere with anything already existing.

There also is the "real code" argument you seem to totally neglect. Whatever
you suggest, you should actually implement it and show that it works and that
it really is better than the proposed approach. Till then, it's just pure
speculation.

So there's real code that I don't really have a problem with and I haven't seen
any alternative implemented and working yet.

Rafael

2010-06-10 19:01:41

by Mark Brown

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thu, Jun 10, 2010 at 05:46:46PM +0200, Rafael J. Wysocki wrote:
> On Thursday, June 10, 2010, Mark Brown wrote:
> > On Thu, Jun 10, 2010 at 10:59:43AM +0200, Rafael J. Wysocki wrote:

> > > So, for a phone-like system, where you'd generally want to simplify user space,
> > > having a "power manager" in the kernel seems to make sense to me.

> > I'm not clear where this requirement to simplify user space specifically
> > for phones comes from

> This isn't a requirement, but something that IMO is reasonable.

> > - phones do have pretty substantial software stacks that aren't that far away
> > to PCs.

> That doesn't seem to be relevant here.

Sure, that's my point - you seem to be suggesting that phones have a
different requirement here.

2010-06-10 23:02:37

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/10 Alan Stern <[email protected]>:
> On Wed, 9 Jun 2010, Arve Hj?nnev?g wrote:
>
>> > I think this is where you misunderstood. ?There is no "report wakeup
>> > event" as such. ?All that happens is that data becomes available to be
>> > read from the input device file. ?However the power manager process
>> > isn't polling the device file at this point (because a suspend blocker
>> > is active), so it doesn't realize that the source has become active
>> > again.
>> >
>>
>> Yes this is not what I though you were suggesting. I thought you were
>> trying to make sure the power manager sees all wakeup events. If you
>> are only listening for wakeup events while no suspend blockers are
>> active, why latch them?
>
> ...
>
>> If you only poll the fd after the last user-space suspend blocker is
>> released, why do you care when the kernel wakelock could have been
>> released? It seem the only thing it saves you is an extra poll call
>> when two wakeup events happen at the same time and one of them is
>> fully processed and unblocks suspend before the other event handler
>> blocks suspend. It seems strange to remove your wakeup event from the
>> list when a specific suspend blocker is acquired when any suspend
>> blocker will prevent that wakeup event from being added to the list in
>> the first place.
>
> ...
>
>> I don't think there is a need to tie the fds to anything else. If you
>> poll the fds on the last suspend unblock call, you should get the same
>> behaviour.
>
> You are quite right; there is no need to associate suspend blockers
> with wakeup sources. ?The power manager merely needs to poll all wakeup
> sources whenever no suspend blockers are active. ?I put those
> associations in the proposal because of the line of reasoning that led
> up to it, but they aren't necessary.
>
>
>> All input events that can wake the system are handled by one
>> user-space suspend blocker. Input devices come and go so we would need
>> to add and remove the fds dynamically.
>
> Correct; the power manager would need to know whenever a wakeup-capable
> device file was opened or closed.
>
>
>> > The power manager would indeed have to know about wakeup devices that
>> > don't need to _keep_ the system awake. ?Here's one way to cope: During
>> > those times when no suspend blockers are active but the PM process
>> > thinks a wakeup source is active, the PM process could poll every few
>> > seconds to update its list of active sources. ?At those points it could
>> > remove wakeup sources that have timed out.
>> >
>>
>> For that to work the wakeup events would have to be reported to the
>> power manager in a reliable way in the first place. Passing the file
>> descriptor that the app uses to the power manager does not work for
>> this, since the app could read the event while the power manager was
>> not in the poll call and the power manager would never see it.
>
> If the app activates a suspend blocker before reading the event, this
> doesn't matter. ?If the app doesn't activate a suspend blocker then it
> risks being suspended after it has read the event but before it has
> handled the event. ?This is equally true with wakelocks.
>

It is not the same. Using a wakelock with a timeout only has a problem
if the app did not get a change to run and block suspend before the
timeout expires. With the timeout values we use there is only a
problem if the system is already unresponsive. If the driver does not
block suspend but instead a power manager calls select or poll on a
file descriptor while the app does a blocking read, the power manager
can easily miss the event and suspend before the app blocks suspend.

>> Also,
>> existing apps don't pass their file descriptors to the power manager,
>> so it has the get the event from somewhere else.
>
> Now you've put your finger on the key. ?The main difference between my
> scheme and the original wakelock scheme is that programs have to inform
> the power manager whenever they open or close a wakeup-capable device
> file. ?With everything implemented inside the kernel this isn't
> necessary, because obviously a kernel driver already knows when its
> device file is opened or closed.
>
> As I said before, this additional complication in userspace is the
> price paid for keeping stuff out of the kernel. ?If you had implemented
> wakelocks this way originally, you would not have needed to patch the
> vanilla kernel and this entire stormy discussion would never have
> occurred. ?(But of course you couldn't have used this for the original
> implementation of wakelocks, because back then the hardware couldn't
> achieve the lowest power states from idle.)
>
>> > Obviously this proposal would complicate your userspace. ?Not
>> > enormously, since most of the work is confined to the power manager,
>>
>> No, the main problem it that it is not confined to the power manager.
>> The power manager does not have a list of file descriptors to monitor,
>> so we have to modify all code that handles wakeup events. This
>> includes vendor supplied code that we don't have the source for, but,
>> on some platforms at least, this code relies on kernel wakelocks with
>> a timeout and could at first be handled the same way we would have to
>> handle existing apps reading from a socket.
>
> Look, I never said this scheme was _better_ than wakelocks. ?I merely
> said that it could be used without significant modifications to the
> kernel.
>
>> The suspend blocker approach is more generally useful since it
>> supports hardware where suspend is needed. Why this argument is being
>> ignored is very puzzling.
>
> Probably because people doesn't envision system suspend being used for
> dynamic power management on that kind of hardware.
>

I'm not sure what you mean by dynamic power management here (frequency
of suspends?), but auto suspend is already in use on x86 desktops and
laptops. Suspend blockers can fix the race with some wakeup events
there.

>> Your solution is not immediately useful since depends on removing all
>> periodic kernel timers and adding support to stopping the monotonic
>> clock for set of processes that are frozen.
>
> The monotonic clock is indeed an issue.
>
> Periodic kernel timers... I don't know about them. ?Has anyone tried to
> measure them? ?There are excellent reasons for trimming them when the
> system goes into low-power idle, independent of Android. ?This sounds
> like the kind of thing Arjan might have worked on.
>
>> Your solution forces the user space interface to be more complicated,
>> and it creates a new user space power manager that has to start before
>> any process that handle wakeup events.
>
> That's right.
>
> Alan Stern
>
>



--
Arve Hj?nnev?g

2010-06-11 01:44:10

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thu, 10 Jun 2010, Arve Hj?nnev?g wrote:

> >> For that to work the wakeup events would have to be reported to the
> >> power manager in a reliable way in the first place. Passing the file
> >> descriptor that the app uses to the power manager does not work for
> >> this, since the app could read the event while the power manager was
> >> not in the poll call and the power manager would never see it.
> >
> > If the app activates a suspend blocker before reading the event, this
> > doesn't matter. ?If the app doesn't activate a suspend blocker then it
> > risks being suspended after it has read the event but before it has
> > handled the event. ?This is equally true with wakelocks.
> >
>
> It is not the same. Using a wakelock with a timeout only has a problem
> if the app did not get a change to run and block suspend before the
> timeout expires. With the timeout values we use there is only a
> problem if the system is already unresponsive. If the driver does not
> block suspend but instead a power manager calls select or poll on a
> file descriptor while the app does a blocking read, the power manager
> can easily miss the event and suspend before the app blocks suspend.

You've lost me. If the power manager is sitting inside a select/poll,
how can it miss the event (given that the event will make data
available to be read on one of the descriptors being polled)?

Or put it another way: With wakelocks, if the app doesn't use a suspend
blocker then once it reads the event data and the timed wakelock is
deactivated, there is nothing to prevent the system from immediately
going into opportunistic suspend. My scheme can fail in the same way.
Is that what you meant?

> >> The suspend blocker approach is more generally useful since it
> >> supports hardware where suspend is needed. Why this argument is being
> >> ignored is very puzzling.
> >
> > Probably because people doesn't envision system suspend being used for
> > dynamic power management on that kind of hardware.
> >
>
> I'm not sure what you mean by dynamic power management here (frequency
> of suspends?), but auto suspend is already in use on x86 desktops and
> laptops. Suspend blockers can fix the race with some wakeup events
> there.

You should stress this point more strongly when conversing with others.
I doubt it will be enough to change anybody's mind, but it can't hurt.
Indeed, if you propose suspend blockers as a way to fix a lost-wakeup
bug in existing distributions, rather than as something needed to
support Android, people might view it more favorably.


There's one question that I don't remember ever seeing answered. To
which kernel drivers do you intend to add suspend blockers?

Alan Stern

2010-06-11 03:16:48

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/10 Alan Stern <[email protected]>:
> On Thu, 10 Jun 2010, Arve Hj?nnev?g wrote:
>
>> >> For that to work the wakeup events would have to be reported to the
>> >> power manager in a reliable way in the first place. Passing the file
>> >> descriptor that the app uses to the power manager does not work for
>> >> this, since the app could read the event while the power manager was
>> >> not in the poll call and the power manager would never see it.
>> >
>> > If the app activates a suspend blocker before reading the event, this
>> > doesn't matter. ?If the app doesn't activate a suspend blocker then it
>> > risks being suspended after it has read the event but before it has
>> > handled the event. ?This is equally true with wakelocks.
>> >
>>
>> It is not the same. Using a wakelock with a timeout only has a problem
>> if the app did not get a change to run and block suspend before the
>> timeout expires. With the timeout values we use there is only a
>> problem if the system is already unresponsive. If the driver does not
>> block suspend but instead a power manager calls select or poll on a
>> file descriptor while the app does a blocking read, the power manager
>> can easily miss the event and suspend before the app blocks suspend.
>
> You've lost me. ?If the power manager is sitting inside a select/poll,
> how can it miss the event (given that the event will make data
> available to be read on one of the descriptors being polled)?
>

It cannot sit inside of select/poll all the time.

> Or put it another way: With wakelocks, if the app doesn't use a suspend
> blocker then once it reads the event data and the timed wakelock is
> deactivated, there is nothing to prevent the system from immediately
> going into opportunistic suspend. ?My scheme can fail in the same way.
> Is that what you meant?
>

No, if an app reads from a file descriptor and block suspend when the
read call returns, then suspend is blocked while processing the data.
If the driver uses a wakelock with a timeout this will fail if the
thread does not get to the suspend block call before the timeout
expires, but unrelated events that don't prevent the app from running
will not cause any problems. In your scheme the user-space power
manager may miss events on this file descriptor since select/poll will
not see an event if the app read that event right before the power
manager called select/poll.

>> >> The suspend blocker approach is more generally useful since it
>> >> supports hardware where suspend is needed. Why this argument is being
>> >> ignored is very puzzling.
>> >
>> > Probably because people doesn't envision system suspend being used for
>> > dynamic power management on that kind of hardware.
>> >
>>
>> I'm not sure what you mean by dynamic power management here (frequency
>> of suspends?), but auto suspend is already in use on x86 desktops and
>> laptops. Suspend blockers can fix the race with some wakeup events
>> there.
>
> You should stress this point more strongly when conversing with others.
> I doubt it will be enough to change anybody's mind, but it can't hurt.
> Indeed, if you propose suspend blockers as a way to fix a lost-wakeup
> bug in existing distributions, rather than as something needed to
> support Android, people might view it more favorably.
>
>
> There's one question that I don't remember ever seeing answered. ?To
> which kernel drivers do you intend to add suspend blockers?
>

All drivers that generate wakeup events need to either use suspend
blockers directly or call into something else that does. For instance,
with the patch to block suspend while input events are queued to
user-space, an input driver that fully handles its events in its
interrupt handler does not need any additional suspend blockers, but
if the driver needs a work function or a timer to run before it
reports the event it needs to block suspend until it has reported the
event.

--
Arve Hj?nnev?g

2010-06-11 04:21:43

by David Brownell

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

This is a bit off the topic of Android
flamage, but I thought it would be worth
highlighting an example where the current
frameworks may still have a deficiency...
one that likewise relates to needing to
block entry ot a system suspend state, but
in this case user-space isn't very involved
(just drivers coping with hardware).

The example I wanted to re-post (I've done so
in the past) is one where drivers ouldn't really
do the right thing, since driver.suspend() wasn't
quite powerful enough as a programming interface.

The example works with USB on many ARM SoCs,
and similar non-USB examples aren't rare.

- Want to enter a system suspend state, with
some USB wakeup sources. USB peripheral
waken up by the host, or vice versa.

NOTE ASSUMPTION: there are multiple suspend
states supported by the hardware, significantly
different in hardware configuration Linux should
be able to use more than one such state... (if
only because their power savings differ.) This
can mean driver-specific knowledge about those
various states.

- The wakeup requires a particular clock to be active, so the USB controller can detect that the
wakeup should trigger, then issue the right signals
triggering the non-USB parts of the system.

Problem: how does the device driver suspend()
method block entry to a suspend state
when it can't ensure that clock is
going to be active. Magic return code?

There are other issues here too. (Is the target
system suspend state one of the ones which doesn't
allow that clock to be active? SoC-specific calls
might suffice for this issue.



A number of years ago, this problem was insoluble
with the then-current Linux PM and clock frameworks.
I've been away from this issue for quite a while
now, but don't recall seeing its sub-problems get
solved ... If they're now solved, I'll be glad.
(I know Kevin's recent OMAP stuff addresses similar
issues, but It's OMAP-specific...)

After all these thousands and thousands of emails...
I'm not sure how much forward motion has happened.

Do we at least have a clean way that a driver can
reject a system suspend? I've lost track of many
issues, but maybe this could be phrased as a QOS
constraint: the current config of driver X needs
clock Y active to enter the target system suspend
state, driver's suspend() method reports as much. Then the entry to that system state gets blocked
if the clock isn't enabled.

(That QOS constraint should be removed when that
driver no longer needs to issue wakeups; that's
not quite the same as "removed by driver.resume().



2010-06-11 14:28:32

by James Bottomley

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thu, 2010-06-10 at 21:21 -0700, David Brownell wrote:
> Do we at least have a clean way that a driver can
> reject a system suspend? I've lost track of many
> issues, but maybe this could be phrased as a QOS
> constraint: the current config of driver X needs
> clock Y active to enter the target system suspend
> state, driver's suspend() method reports as much. Then the entry to
> that system state gets blocked
> if the clock isn't enabled.

So in QoS modifications to android patches, the answer is "yes" ...
except that the current android patch set didn't actually have this type
of wakelock in it.

Android wants an idleness suspend block (or pm qos constraint) that a
driver can set to prevent the system idleness power govenor from
dropping into a power state too low for the driver, so in USB terms this
would prevent the states that shut down the clock. For android, it
prevented shutdown of an internal i2c bus.

The one thing that does look difficult is that these power constraints
are device (and sometimes SoC) specific. Expressing them in a generic
way for the cpu govenors to make sense of might be hard.

> (That QOS constraint should be removed when that
> driver no longer needs to issue wakeups; that's
> not quite the same as "removed by driver.resume().

The USB one needs user input, doesn't it, since user hot plug might (or
might not) be one of the wakeup sources.

James

2010-06-11 14:33:33

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thu, 10 Jun 2010, Arve Hj?nnev?g wrote:

> > You've lost me. ?If the power manager is sitting inside a select/poll,
> > how can it miss the event (given that the event will make data
> > available to be read on one of the descriptors being polled)?
> >
>
> It cannot sit inside of select/poll all the time.
>
> > Or put it another way: With wakelocks, if the app doesn't use a suspend
> > blocker then once it reads the event data and the timed wakelock is
> > deactivated, there is nothing to prevent the system from immediately
> > going into opportunistic suspend. ?My scheme can fail in the same way.
> > Is that what you meant?
> >
>
> No, if an app reads from a file descriptor and block suspend when the
> read call returns, then suspend is blocked while processing the data.
> If the driver uses a wakelock with a timeout this will fail if the
> thread does not get to the suspend block call before the timeout
> expires, but unrelated events that don't prevent the app from running
> will not cause any problems.

Wait a second. Maybe I have misunderstood how timeouts are supposed to
work with wakelocks. I thought the idea was that the wakelock would be
released when the timeout expires or the event queue is emptied,
whichever comes first. Now it sounds like you're saying that the
wakelock doesn't get released until the timeout expires, even if
userspace finishes processing all pending events before then.

> In your scheme the user-space power
> manager may miss events on this file descriptor since select/poll will
> not see an event if the app read that event right before the power
> manager called select/poll.

If the wakelock is supposed to remain active until the timeout expires
then you are right. On the other hand, this seems like a rather
strange and suspicious way of handling wakelocks. Why would you want
to do it that way?

> > There's one question that I don't remember ever seeing answered. ?To
> > which kernel drivers do you intend to add suspend blockers?
> >
>
> All drivers that generate wakeup events need to either use suspend
> blockers directly or call into something else that does. For instance,
> with the patch to block suspend while input events are queued to
> user-space, an input driver that fully handles its events in its
> interrupt handler does not need any additional suspend blockers, but
> if the driver needs a work function or a timer to run before it
> reports the event it needs to block suspend until it has reported the
> event.

Sure. But specifically, which drivers on Android generate wakeup
events? And which of them don't fully handle their events in their
interrupt handlers?

Maybe another way to put this is: Where in the kernel do you intend to
add suspend blockers?

Alan Stern

2010-06-11 14:42:11

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Thu, 10 Jun 2010, David Brownell wrote:

> This is a bit off the topic of Android
> flamage, but I thought it would be worth
> highlighting an example where the current
> frameworks may still have a deficiency...
> one that likewise relates to needing to
> block entry ot a system suspend state, but
> in this case user-space isn't very involved
> (just drivers coping with hardware).
>
> The example I wanted to re-post (I've done so
> in the past) is one where drivers ouldn't really
> do the right thing, since driver.suspend() wasn't
> quite powerful enough as a programming interface.
>
> The example works with USB on many ARM SoCs,
> and similar non-USB examples aren't rare.
>
> - Want to enter a system suspend state, with
> some USB wakeup sources. USB peripheral
> waken up by the host, or vice versa.
>
> NOTE ASSUMPTION: there are multiple suspend
> states supported by the hardware, significantly
> different in hardware configuration Linux should
> be able to use more than one such state... (if
> only because their power savings differ.) This
> can mean driver-specific knowledge about those
> various states.
>
> - The wakeup requires a particular clock to be active, so the USB controller can detect that the
> wakeup should trigger, then issue the right signals
> triggering the non-USB parts of the system.
>
> Problem: how does the device driver suspend()
> method block entry to a suspend state
> when it can't ensure that clock is
> going to be active. Magic return code?

At the moment, drivers aren't told what suspend state the system is
going into. They know the difference between suspend and hibernate,
but the PM core doesn't tell drivers whether it's going into standby
vs. suspend. (Strictly speaking, those terms apply mostly to ACPI
systems and not so much elsewhere. What I'm talking about is the
/sys/power/state interface, where the user can write either "standby"
or "mem". Drivers aren't told which was written.)

> There are other issues here too. (Is the target
> system suspend state one of the ones which doesn't
> allow that clock to be active? SoC-specific calls
> might suffice for this issue.

I imagine platforms have to answer all such questions when they decide
exactly how they will implement "standby" and "mem".

> A number of years ago, this problem was insoluble
> with the then-current Linux PM and clock frameworks.
> I've been away from this issue for quite a while
> now, but don't recall seeing its sub-problems get
> solved ... If they're now solved, I'll be glad.
> (I know Kevin's recent OMAP stuff addresses similar
> issues, but It's OMAP-specific...)
>
> After all these thousands and thousands of emails...
> I'm not sure how much forward motion has happened.

I'm not aware of any progress.

Alan Stern

2010-06-11 14:46:32

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Fri, 11 Jun 2010, James Bottomley wrote:

> On Thu, 2010-06-10 at 21:21 -0700, David Brownell wrote:
> > Do we at least have a clean way that a driver can
> > reject a system suspend? I've lost track of many
> > issues, but maybe this could be phrased as a QOS
> > constraint: the current config of driver X needs
> > clock Y active to enter the target system suspend
> > state, driver's suspend() method reports as much. Then the entry to
> > that system state gets blocked
> > if the clock isn't enabled.
>
> So in QoS modifications to android patches, the answer is "yes" ...
> except that the current android patch set didn't actually have this type
> of wakelock in it.
>
> Android wants an idleness suspend block (or pm qos constraint) that a
> driver can set to prevent the system idleness power govenor from
> dropping into a power state too low for the driver, so in USB terms this
> would prevent the states that shut down the clock. For android, it
> prevented shutdown of an internal i2c bus.
>
> The one thing that does look difficult is that these power constraints
> are device (and sometimes SoC) specific. Expressing them in a generic
> way for the cpu govenors to make sense of might be hard.

Doesn't the clock framework already handle this sort of thing?

Alan Stern

2010-06-11 15:01:55

by Mark Brown

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Fri, Jun 11, 2010 at 10:46:27AM -0400, Alan Stern wrote:
> On Fri, 11 Jun 2010, James Bottomley wrote:

> > The one thing that does look difficult is that these power constraints
> > are device (and sometimes SoC) specific. Expressing them in a generic
> > way for the cpu govenors to make sense of might be hard.

> Doesn't the clock framework already handle this sort of thing?

The clock framework is implemented independantly for each CPU.

2010-06-11 15:02:32

by James Bottomley

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Fri, 2010-06-11 at 10:46 -0400, Alan Stern wrote:
> On Fri, 11 Jun 2010, James Bottomley wrote:
>
> > On Thu, 2010-06-10 at 21:21 -0700, David Brownell wrote:
> > > Do we at least have a clean way that a driver can
> > > reject a system suspend? I've lost track of many
> > > issues, but maybe this could be phrased as a QOS
> > > constraint: the current config of driver X needs
> > > clock Y active to enter the target system suspend
> > > state, driver's suspend() method reports as much. Then the entry to
> > > that system state gets blocked
> > > if the clock isn't enabled.
> >
> > So in QoS modifications to android patches, the answer is "yes" ...
> > except that the current android patch set didn't actually have this type
> > of wakelock in it.
> >
> > Android wants an idleness suspend block (or pm qos constraint) that a
> > driver can set to prevent the system idleness power govenor from
> > dropping into a power state too low for the driver, so in USB terms this
> > would prevent the states that shut down the clock. For android, it
> > prevented shutdown of an internal i2c bus.
> >
> > The one thing that does look difficult is that these power constraints
> > are device (and sometimes SoC) specific. Expressing them in a generic
> > way for the cpu govenors to make sense of might be hard.
>
> Doesn't the clock framework already handle this sort of thing?

Well, there are two elements to "this sort of thing":

1. Allow a driver to request that a given clock not be turned off.
2. Make the cpuidle governors aware of a pending "don't turn off X
clock source" so they can keep the system in a state where the
clock doesn't get powered down.

As far as I can tell from the code, neither currently exists at the
moment.

James

2010-06-11 20:48:18

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Fri, 11 Jun 2010, Mark Brown wrote:

> On Fri, Jun 11, 2010 at 10:46:27AM -0400, Alan Stern wrote:
> > On Fri, 11 Jun 2010, James Bottomley wrote:
>
> > > The one thing that does look difficult is that these power constraints
> > > are device (and sometimes SoC) specific. Expressing them in a generic
> > > way for the cpu govenors to make sense of might be hard.
>
> > Doesn't the clock framework already handle this sort of thing?
>
> The clock framework is implemented independantly for each CPU.

That's not an impediment, since drivers' requirements regarding which
clocks remain running in which power states are necessarily
platform-dependent also.


On Fri, 11 Jun 2010, James Bottomley wrote:

> Well, there are two elements to "this sort of thing":
>
> 1. Allow a driver to request that a given clock not be turned off.
> 2. Make the cpuidle governors aware of a pending "don't turn off X
> clock source" so they can keep the system in a state where the
> clock doesn't get powered down.
>
> As far as I can tell from the code, neither currently exists at the
> moment.

Well then, can (or should) the clock framework interact with the
pm-qos subsystem so that drivers don't have to worry about it?

Alan Stern

2010-06-11 21:04:47

by James Bottomley

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Fri, 2010-06-11 at 16:48 -0400, Alan Stern wrote:
> On Fri, 11 Jun 2010, Mark Brown wrote:
>
> > On Fri, Jun 11, 2010 at 10:46:27AM -0400, Alan Stern wrote:
> > > On Fri, 11 Jun 2010, James Bottomley wrote:
> >
> > > > The one thing that does look difficult is that these power constraints
> > > > are device (and sometimes SoC) specific. Expressing them in a generic
> > > > way for the cpu govenors to make sense of might be hard.
> >
> > > Doesn't the clock framework already handle this sort of thing?
> >
> > The clock framework is implemented independantly for each CPU.
>
> That's not an impediment, since drivers' requirements regarding which
> clocks remain running in which power states are necessarily
> platform-dependent also.
>
>
> On Fri, 11 Jun 2010, James Bottomley wrote:
>
> > Well, there are two elements to "this sort of thing":
> >
> > 1. Allow a driver to request that a given clock not be turned off.
> > 2. Make the cpuidle governors aware of a pending "don't turn off X
> > clock source" so they can keep the system in a state where the
> > clock doesn't get powered down.
> >
> > As far as I can tell from the code, neither currently exists at the
> > moment.
>
> Well then, can (or should) the clock framework interact with the
> pm-qos subsystem so that drivers don't have to worry about it?

So the implementation of this seems to be a bit complex: We could have
clockevents_register() do a per clock pm_qos variable but then the
cpuidle governors need to know which to listen for so they don't
transition to a state too low for them to be active if pm_qos says keep
them running. Even if that gets sorted out, how would USB know which
platform specific clock source is driving the wakeup events on its bus?

How complex can SoC clocksources be? If it's just a simple binary
do/don't potentially stop all clocks, I think it's easy. If SoC's have
a hierarchical shutdown sequence, and they really need this to save
power, then expressing that generically becomes rather problematic.

James



2010-06-11 22:26:50

by Arve Hjønnevåg

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

2010/6/11 Alan Stern <[email protected]>:
> On Thu, 10 Jun 2010, Arve Hj?nnev?g wrote:
>
>> > You've lost me. ?If the power manager is sitting inside a select/poll,
>> > how can it miss the event (given that the event will make data
>> > available to be read on one of the descriptors being polled)?
>> >
>>
>> It cannot sit inside of select/poll all the time.
>>
>> > Or put it another way: With wakelocks, if the app doesn't use a suspend
>> > blocker then once it reads the event data and the timed wakelock is
>> > deactivated, there is nothing to prevent the system from immediately
>> > going into opportunistic suspend. ?My scheme can fail in the same way.
>> > Is that what you meant?
>> >
>>
>> No, if an app reads from a file descriptor and block suspend when the
>> read call returns, then suspend is blocked while processing the data.
>> If the driver uses a wakelock with a timeout this will fail if the
>> thread does not get to the suspend block call before the timeout
>> expires, but unrelated events that don't prevent the app from running
>> will not cause any problems.
>
> Wait a second. ?Maybe I have misunderstood how timeouts are supposed to
> work with wakelocks. ?I thought the idea was that the wakelock would be
> released when the timeout expires or the event queue is emptied,

That is one way to use it, and I did this so code that opened an input
device without reading from it would not prevent suspend forever. In
the last patchset I posted, I instead used an ioctl to enable the
suspend blocker.

> whichever comes first. ?Now it sounds like you're saying that the
> wakelock doesn't get released until the timeout expires, even if
> userspace finishes processing all pending events before then.
>

For incoming network traffic we use a wakelock with a timeout to
prevent suspend long enough for the data to make it to user-space
since we have not added wakelocks to the network stack.

>> In your scheme the user-space power
>> manager may miss events on this file descriptor since select/poll will
>> not see an event if the app read that event right before the power
>> manager called select/poll.
>
> If the wakelock is supposed to remain active until the timeout expires
> then you are right. ?On the other hand, this seems like a rather
> strange and suspicious way of handling wakelocks. ?Why would you want
> to do it that way?
>

We did this to avoid changing to the network stack, tty layer, etc.

>> > There's one question that I don't remember ever seeing answered. ?To
>> > which kernel drivers do you intend to add suspend blockers?
>> >
>>
>> All drivers that generate wakeup events need to either use suspend
>> blockers directly or call into something else that does. For instance,
>> with the patch to block suspend while input events are queued to
>> user-space, an input driver that fully handles its events in its
>> interrupt handler does not need any additional suspend blockers, but
>> if the driver needs a work function or a timer to run before it
>> reports the event it needs to block suspend until it has reported the
>> event.
>
> Sure. ?But specifically, which drivers on Android generate wakeup
> events? ?And which of them don't fully handle their events in their
> interrupt handlers?
>

Keypad, network, charger, rtc, but I'm sure I forgot some.

> Maybe another way to put this is: Where in the kernel do you intend to
> add suspend blockers?
>

In addition to the drivers that enable the wakeup events, we have
added suspend blockers to the input event code and power supply
framework. The tty layer and network stack would also need suspend
blockers to avoid using timeouts.

--
Arve Hj?nnev?g

2010-06-12 02:44:00

by David Brownell

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration



--- On Fri, 6/11/10, James Bottomley <[email protected]> wrote:

> > Do we at least have a clean way that a driver can
> > reject a system suspend?? I've lost track of
> many
> > issues, but maybe this could be phrased as a QOS
> > constraint:? the current config of driver X
> needs
> > clock Y active to enter the? target system
> suspend
> > state, driver's suspend() method reports as
> much.? Then the entry to
> > that system state gets blocked
> > if the clock isn't enabled.
>
> So in QoS modifications to android patches, the answer is "yes" ...
> except that the current android patch set didn't actually
> have this type
> of wakelock in it.

Except, we're not talking "wakelock" ... :)

So ... no, these cases still have no solution.
(I think that's at least five years now.)


> The one thing that does look difficult is that these power
> constraints
> are device (and sometimes SoC) specific.

Exactly why they make good examples for
turning up framework limitations... like
having overlooked constraints coming from
various SoC peripherals.

> Expressing them in a generic way
> for the cpu govenors to make sense of might be hard.

Requiring "CPU governors" to be involved in such
stuff feels a bit off-course to me. At least, if
the involvement is very deep. The constraints are
from the integrated peripherals, not (usually)
from the CPU

There are plent of places to hang SoC or
device specific data, once there's awareness
that without such data, (which PCs hide behind
ACPI bytecode) the PM framework is missing out
on support for some desirable low power modes.


>
> > (That QOS constraint should be removed when that
> > driver no longer needs to issue wakeups; that's
> > not quite the same as "removed by driver.resume().
>
> The USB one needs user input, doesn't it,

I don't quite see that. The drivers get called
in enough places, and they'll know if the system
is going to be in a suspend state where they need
to be partially
aactive (with various QOS constraints, or they
can't work.

> since user hotplug might (or
> might not) be one of the wakeup sources.

If it wakes via hotplug, then the driver must
already Do The Right Thing; what input would
be needed (Beyond changing the cable config)?

>
> James
>
>
> --

2010-06-12 15:22:23

by Alan Stern

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Fri, 11 Jun 2010, Arve Hj?nnev?g wrote:

> > Wait a second. ?Maybe I have misunderstood how timeouts are supposed to
> > work with wakelocks. ?I thought the idea was that the wakelock would be
> > released when the timeout expires or the event queue is emptied,
>
> That is one way to use it, and I did this so code that opened an input
> device without reading from it would not prevent suspend forever. In
> the last patchset I posted, I instead used an ioctl to enable the
> suspend blocker.
>
> > whichever comes first. ?Now it sounds like you're saying that the
> > wakelock doesn't get released until the timeout expires, even if
> > userspace finishes processing all pending events before then.
> >
>
> For incoming network traffic we use a wakelock with a timeout to
> prevent suspend long enough for the data to make it to user-space
> since we have not added wakelocks to the network stack.

> We did this to avoid changing to the network stack, tty layer, etc.

I see. These are examples where wakelocks are _not_ released by any
userspace action, so they don't seem to fit well into my all-userspace
scheme. At least, not in their current form. On the other hand, if
the network/tty wakeup events eventually cause data to become available
on a socket or device file, then they wouldn't need any special
treatment in my scheme. The socket/file descriptors could be handled
the same as any others. (Although you might need to change some apps,
to make them follow the usual pattern of poll, activate suspend
blocker, read, process, release suspend blocker.)

In other words, the fact that everything has been moved into userspace
means that you wouldn't have to worry about the missing wakelocks in
the network stack or tty layer, and consequently wouldn't have to worry
about using timed wakelocks there.

> > Sure. ?But specifically, which drivers on Android generate wakeup
> > events? ?And which of them don't fully handle their events in their
> > interrupt handlers?
> >
>
> Keypad, network, charger, rtc, but I'm sure I forgot some.
>
> > Maybe another way to put this is: Where in the kernel do you intend to
> > add suspend blockers?
> >
>
> In addition to the drivers that enable the wakeup events, we have
> added suspend blockers to the input event code and power supply
> framework. The tty layer and network stack would also need suspend
> blockers to avoid using timeouts.

I see. The keypad, charger (power supply), and rtc drivers sound
pretty platform-specific. Probably nobody would complain too strongly
about adding suspend blockers there.

The input, network, and tty layers are more general, though. That's
where you're most likely to encounter resistance.

People have been complaining about "suspend blockers being added all
throughout the kernel". It might help if you pointed out that it's
just in these three layers (and maybe at only a few specific points
within each layer).

Alan Stern

2010-06-12 17:10:45

by Mark Brown

[permalink] [raw]
Subject: Re: [linux-pm] suspend blockers & Android integration

On Fri, Jun 11, 2010 at 04:48:15PM -0400, Alan Stern wrote:
> On Fri, 11 Jun 2010, Mark Brown wrote:

> > The clock framework is implemented independantly for each CPU.

> That's not an impediment, since drivers' requirements regarding which
> clocks remain running in which power states are necessarily
> platform-dependent also.

It does mean that you can't make any general statements about what the
clock framework does and doesn't do which is what the person I was
replying to (the quoting you did as you cut'n'pasted replies to multiple
messages seems more than a little confused, sorry) seemed to want.

2010-07-09 19:11:54

by Pavel Machek

[permalink] [raw]
Subject: HTC Dream drivers was Re: [linux-pm] suspend blockers & Android integration

Hi!


> > So for people who really care about running a mainline kernel on their
> > android device doing that part first on a generic ARM board in qemu
> > might be much better first step work.
>
> > On the other hand I've heard
> > that various hardware vendors or parties closed to them are rather
> > annoyed by their drivers beeing stuck in the android tree - but that
> > can be easily solved by getting removing the suspend blockers (at least
> > temporarily), cleaning up a few bits here and there and getting them in.
>
> This continues to baffle me. If we (Google) are such a headache, why
> not just route around us. The drivers we've written are GPLv2, the

Well, we did route around you, and that's why the HTC Dream drivers
are in
staging.

Unfortunately...

> source is out there for anyone who wants it, etc. The drivers other
> people have written we have no control over at all. From my point of
> view it'd be an annoyance if somebody took the code we wrote, modified
> it heavily, and pushed it upstream, but fundamentally I can't stop
> that from happening other than by pushing it upstream myself, first.

...you were calling bloody murder when we tried to name them right way
-- thats board_dream, not board_trout -- and AFAIK you still did not
switch to actually using those drivers so that diffs would grow
smaller...

And no, it is not 'just arm architecture #1536'; due to strange
baseband/cpu split in msm drivers end up quite big&complex...

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html