Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753239AbdDCSJj (ORCPT ); Mon, 3 Apr 2017 14:09:39 -0400
Received: from mail-wr0-f175.google.com ([209.85.128.175]:36402 "EHLO
        mail-wr0-f175.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753044AbdDCSJg (ORCPT );
        Mon, 3 Apr 2017 14:09:36 -0400
MIME-Version: 1.0
In-Reply-To: <20170403132502.GA32491@kroah.com>
References: <20170331175341.19889-1-dianders@chromium.org>
        <20170331192944.GB9744@kroah.com>
        <20170401064856.GA14971@kroah.com>
        <20170403132502.GA32491@kroah.com>
From: Doug Anderson
Date: Mon, 3 Apr 2017 11:09:32 -0700
X-Google-Sender-Auth: iw47A-yw49KXHph1TNPcnZZ6Jy4
Message-ID:
Subject: Re: [RFC PATCH] binder: Don't require the binder lock when killed in binder_thread_read()
To: Greg KH
Cc: arve@android.com, riandrews@android.com, Todd Kjos,
        devel@driverdev.osuosl.org,
        "linux-kernel@vger.kernel.org",
        "stable@vger.kernel.org", mhocko@suse.com
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4307
Lines: 96

Hi,

On Mon, Apr 3, 2017 at 6:25 AM, Greg KH wrote:
> On Sat, Apr 01, 2017 at 07:34:53PM -0700, Doug Anderson wrote:
>> Hi,
>>
>> On Fri, Mar 31, 2017 at 11:48 PM, Greg KH wrote:
>> > On Fri, Mar 31, 2017 at 02:00:13PM -0700, Doug Anderson wrote:
>> >> On Fri, Mar 31, 2017 at 12:29 PM, Greg KH wrote:
>> >> BTW: I presume that nobody has decided that it would be a wise idea to
>> >> pick the OOM reaper code back to any stable trees? It seemed a bit
>> >> too scary to me, so I wrote a dumber (but easier to backport) solution
>> >> that avoided the deadlocks I was seeing. http://crosreview.com/465189
>> >> and the 3 patches above it in case anyone else stumbles on this thread
>> >> and is curious.
>> >
>> > What specific upstream OOM patches are you referring to?
>> > I'm always glad to review patches for stable kernels, just email
>> > stable@vger.kernel.org the git commit ids and we can take it from there.
>>
>> +stable
>>
>> I was wondering about the concept of porting the OOM Reaper back to
>> older kernels. The OOM reaper was originally introduced in:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/mm/oom_kill.c?id=aac453635549699c13a84ea1456d5b0e574ef855
>>
>> Basically the problem described in that patch exists in many older
>> kernels and I've certainly seen crashes related to this in 3.10, but I
>> believe older kernels see the same problems too.
>>
>> Personally I wouldn't know exactly which patches were important to
>> backport and how far to go. One could arbitrarily try to backport up
>> to 4.6.7 (since 4.6 was the first kernel to really have the OOM
>> reaper) and ignore all the reaper fixes that landed since then. This
>> would probably be doable for kernel 4.4, though if anyone was trying
>> to support older kernels it might get harder.
>
> Well, I would need someone to give me a list of commits, and actually
> test it to see if it is something that people use/want before I can
> queue anything up for a stable release...
>
> {hint}

Here's a list of the patches between 4.5 and 4.6.7 that touch the
oom_killer:

af8e15cc85a2 oom, oom_reaper: do not enqueue task if it is on the oom_reaper_list head
bb29902a7515 oom, oom_reaper: protect oom_reaper_list using simpler way
e26796066fdf oom: make oom_reaper freezable
29c696e1c6ec oom: make oom_reaper_list single linked
855b01832573 oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task
03049269de43 mm, oom_reaper: implement OOM victims queuing
bc448e897b6d mm, oom_reaper: report success/failure
36324a990cf5 oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address space
aac453635549 mm, oom: introduce oom reaper
6a618957ad17 mm: oom_kill: don't ignore oom score on exiting tasks
6afcf2895e6f mm,oom: make oom_killer_disable() killable
69b27baf00fa sched: add schedule_timeout_idle()

A few of those are code cleanups or are fixing OOM killer bugs
unrelated to the problem at hand (OOM killer getting stuck forever
because a task won't die), but it may still make sense to take them
just to get the oom killer into a consistent / known state (AKA
4.6.7).

The problem (and the reason I was so hesitant to provide a list of
patches) is that I personally am not an expert on mm. Thus:

1. I don't know if there are any subtle dependencies on other "mm"
   patches. I can pick the patches above to a 4.4 kernel with only
   minor conflicts (plus a fix not to look at MM_SHMEMPAGES, which
   didn't exist in 4.4) and they seem to work OK, but with "mm" I am
   always worried about minor changes in some other bit of mm code
   that might be needed to make all the corner cases work right.

2. There are plenty of other fixes to the OOM reaper in 4.7+.
   For instance this patch:

   e2fe14564d33 oom_reaper: close race with exiting task

   References a "Fixes" for the original patch that introduced the OOM
   reaper, but there are 11 other patches between 4.6 and 4.7 that
   also sound like they fix some important bugs and I just don't know
   if they are important things to bring back to linux stable or
   not... Then another 13 patches between 4.7 and 4.8.

Maybe +Michal Hocko would have some opinions of which OOM Reaper
patches would be good for picking into linux stable?

-Doug
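
For anyone who wants to try the 4.5 to 4.6.7 list above on a test
branch, here is a minimal sketch of one way to turn it into
cherry-pick commands. It is not taken from the thread. It assumes the
list is in `git log` order (newest first), which the position of
"mm, oom: introduce oom reaper" near the bottom suggests, so the order
is reversed before picking. The names OOM_PATCHES and
cherry_pick_commands are made up for this illustration, and the sketch
does nothing about the minor conflicts or the MM_SHMEMPAGES fix-up
mentioned for 4.4.

```python
# Commit list copied from the message, in the order it appears there.
OOM_PATCHES = """\
af8e15cc85a2 oom, oom_reaper: do not enqueue task if it is on the oom_reaper_list head
bb29902a7515 oom, oom_reaper: protect oom_reaper_list using simpler way
e26796066fdf oom: make oom_reaper freezable
29c696e1c6ec oom: make oom_reaper_list single linked
855b01832573 oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task
03049269de43 mm, oom_reaper: implement OOM victims queuing
bc448e897b6d mm, oom_reaper: report success/failure
36324a990cf5 oom: clear TIF_MEMDIE after oom_reaper managed to unmap the address space
aac453635549 mm, oom: introduce oom reaper
6a618957ad17 mm: oom_kill: don't ignore oom score on exiting tasks
6afcf2895e6f mm,oom: make oom_killer_disable() killable
69b27baf00fa sched: add schedule_timeout_idle()
"""


def cherry_pick_commands(listing, newest_first=True):
    """Return one `git cherry-pick -x` command per commit, oldest first.

    Assumption: each non-empty line starts with an abbreviated commit id
    followed by the subject, as in `git log --oneline` output.
    """
    commit_ids = [line.split()[0] for line in listing.splitlines() if line.strip()]
    if newest_first:
        # Apply commits in the order they landed upstream.
        commit_ids.reverse()
    # -x appends a "(cherry picked from commit ...)" line to each message.
    return ["git cherry-pick -x " + commit_id for commit_id in commit_ids]


if __name__ == "__main__":
    for command in cherry_pick_commands(OOM_PATCHES):
        print(command)
```

The script only prints the commands, so they can be read over and run
by hand, one at a time, stopping at whichever pick conflicts.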