Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S933460AbdCaTaB (ORCPT ); Fri, 31 Mar 2017 15:30:01 -0400
Received: from mail.linuxfoundation.org ([140.211.169.12]:47822
        "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S932356AbdCaTaA (ORCPT );
        Fri, 31 Mar 2017 15:30:00 -0400
Date: Fri, 31 Mar 2017 21:29:44 +0200
From: Greg KH
To: Douglas Anderson
Cc: arve@android.com, riandrews@android.com, tkjos@google.com,
        devel@driverdev.osuosl.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] binder: Don't require the binder lock when killed
        in binder_thread_read()
Message-ID: <20170331192944.GB9744@kroah.com>
References: <20170331175341.19889-1-dianders@chromium.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170331175341.19889-1-dianders@chromium.org>
User-Agent: Mutt/1.8.0 (2017-02-23)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1500
Lines: 35

On Fri, Mar 31, 2017 at 10:53:41AM -0700, Douglas Anderson wrote:
> Sometimes when we're out of memory the OOM killer decides to kill a
> process that's in binder_thread_read().  If we happen to be waiting
> for work we'll get the kill signal and wake up.  That's good.  ...but
> then we try to grab the binder lock before we return.  That's bad.
>
> The problem is that someone else might be holding the one true global
> binder lock.  If that one other process is blocked then we can't
> finish exiting.  In the worst case, the other process might be blocked
> waiting for memory.  In that case we'll have a really hard time
> exiting.
>
> On older kernels that don't have the OOM reaper (or something
> similar), like kernel 4.4, this is a really big problem and we end up
> with a simple deadlock because:
> * Once we pick a process to OOM kill we won't pick another--we first
>   wait for the process we picked to die.  The reasoning is that we've
>   given the doomed process access to special memory pools so it can
>   quit quickly and we don't have special pool memory to go around.
> * We don't have any type of "special access donation" that would give
>   the mutex holder our special access.
>
> On kernel 4.4 w/ binder patches, we easily see this happen:

How does your change interact with the recent "break up the binder big
lock" patchset:
	https://android-review.googlesource.com/#/c/354698/

Have you tried that series out to see if it helps out any?

thanks,

greg k-h