2014-02-20 13:24:04

by Compostella, Jeremy

Subject: [PATCH] Android / binder: Fix broken walk in binder_node_release()

From: "Compostella, Jeremy" <[email protected]>

This bug can manifest itself in several situations. Here is the one that made me
hunt it down last week:

When an Android device is encrypted, Android starts all the init services of
the core and main levels, then asks for the password and checks it by trying to
mount /data. On success, it kills all the main services, mounts /data and
restarts all the main services.

Unfortunately, when those main services restart, we observe:

DisplayManager Could not get display information from display manager.
DisplayManager android.os.DeadObjectException
DisplayManager at android.os.BinderProxy.transact(Native Method)
DisplayManager at android.hardware.display.IDisplayManager$Stub$Proxy.getDisplayInfo(IDisplayManager.java:228)
DisplayManager at android.hardware.display.DisplayManagerGlobal.getDisplayInfo(DisplayManagerGlobal.java:117)
DisplayManager at android.hardware.display.DisplayManagerGlobal.getCompatibleDisplay(DisplayManagerGlobal.java:176)
DisplayManager at android.app.ResourcesManager.getDisplayMetricsLocked(ResourcesManager.java:96)
DisplayManager at android.app.ResourcesManager.getDisplayMetricsLocked(ResourcesManager.java:74)
[...]

This means that the 'display' service is still registered with the
service_manager but points to a dead object (that is, a process that has died).
This error is the first in a chain of missing "remote" objects that causes
processes to die until the system recovers by itself a few seconds later.

The binder driver allows a "process" to request a notification when a
particular reference dies. In that case, the binder driver associates a death
object with this reference.

When the system_server process dies, its file descriptor to the binder driver
is automatically released and the binder driver walks all the references
associated with this process to deallocate them. When such a reference has an
associated death object, the driver executes a task to notify the previously
registered process, usually the service_manager process, of the death.

The bug is that this walk over all the references is broken by an
unfortunate refactoring made in the following patch:

commit 008fa749e0fe5b2fffd20b7fe4891bb80d072c6a
Author: Mirsal Ennaime <[email protected]>
Date: Tue Mar 12 11:41:59 2013 +0100

which breaks out of the loop if the current reference does not have a death
object instead of continuing to the next reference. As a consequence, none of
the following references are correctly deallocated and no death notification is
sent for them.
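
To illustrate the difference, here is a minimal user-space sketch, not the
driver code. The struct fake_ref type and the walk_buggy()/walk_fixed() helpers
are hypothetical names made up for this example. They only model a walk that
either stops at, or skips over, a reference without a death object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a binder reference (not struct binder_ref). */
struct fake_ref {
	bool has_death;	/* a death notification was requested */
	bool notified;	/* set when the walk delivers the notification */
};

/* Old behaviour: the first reference without a death object ends the walk. */
static int walk_buggy(struct fake_ref *refs, size_t n)
{
	int death = 0;

	assert(refs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!refs[i].has_death)
			break;		/* same effect as the old "goto out" */
		refs[i].notified = true;
		death++;
	}
	return death;
}

/* Fixed behaviour: such a reference is skipped and the walk goes on. */
static int walk_fixed(struct fake_ref *refs, size_t n)
{
	int death = 0;

	assert(refs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!refs[i].has_death)
			continue;
		refs[i].notified = true;
		death++;
	}
	return death;
}
```

For a list of four references where only the second one has no death object,
walk_buggy() delivers one notification and walk_fixed() delivers three.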

Signed-off-by: Jeremy Compostella <[email protected]>
---
drivers/staging/android/binder.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/staging/android/binder.c b/drivers/staging/android/binder.c
index eaec1da..1432d95 100644
--- a/drivers/staging/android/binder.c
+++ b/drivers/staging/android/binder.c
@@ -2904,7 +2904,7 @@ static int binder_node_release(struct binder_node *node, int refs)
 		refs++;

 		if (!ref->death)
-			goto out;
+			continue;

 		death++;

@@ -2917,7 +2917,6 @@ static int binder_node_release(struct binder_node *node, int refs)
 			BUG();
 	}

-out:
 	binder_debug(BINDER_DEBUG_DEAD_BINDER,
 		     "node %d now dead, refs %d, death %d\n",
 		     node->debug_id, refs, death);
--
1.7.10.4


2014-02-20 13:35:11

by Dan Carpenter

Subject: Re: [PATCH] Android / binder: Fix broken walk in binder_node_release()

Fantastic. Thanks.

Reviewed-by: Dan Carpenter <[email protected]>

regards,
dan carpenter

2014-02-21 20:29:05

by Greg KH

Subject: Re: [PATCH] Android / binder: Fix broken walk in binder_node_release()

On Thu, Feb 20, 2014 at 02:22:53PM +0100, Compostella, Jeremy wrote:
> [...]

Thanks, but this fix has already been submitted, and has been part of
the Android kernel git tree for a while with the authorship of someone
else, so I'll use that patch instead when applying it.

greg k-h