Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S933532AbcCJApV (ORCPT ); Wed, 9 Mar 2016 19:45:21 -0500
Received: from gum.cmpxchg.org ([85.214.110.215]:45580 "EHLO gum.cmpxchg.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752281AbcCJApO (ORCPT ); Wed, 9 Mar 2016 19:45:14 -0500
Date: Wed, 9 Mar 2016 19:45:00 -0500
From: Johannes Weiner
To: Andrew Morton
Cc: Tetsuo Handa , mhocko@kernel.org, linux-mm@kvack.org,
        rientjes@google.com, linux-kernel@vger.kernel.org, mhocko@suse.com
Subject: Re: [PATCH 2/2] oom-clear-tif_memdie-after-oom_reaper-managed-to-unmap-the-address-space-fix
Message-ID: <20160310004500.GA7374@cmpxchg.org>
References: <1457442737-8915-1-git-send-email-mhocko@kernel.org>
        <1457442737-8915-3-git-send-email-mhocko@kernel.org>
        <20160309132142.80d0afbf0ae398df8e2adba8@linux-foundation.org>
        <201603100721.CDC86433.OMFOVOHSJFLFQt@I-love.SAKURA.ne.jp>
        <20160309224829.GA5716@cmpxchg.org>
        <20160309150853.2658e3bc75907e404cf3ca33@linux-foundation.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160309150853.2658e3bc75907e404cf3ca33@linux-foundation.org>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 888
Lines: 20

On Wed, Mar 09, 2016 at 03:08:53PM -0800, Andrew Morton wrote:
> On Wed, 9 Mar 2016 17:48:29 -0500 Johannes Weiner wrote:
>
> > However, I disagree with your changelog.
>
> What text would you prefer?

I'd just keep the one you had initially. Or better, this modified version:

When the OOM killer scans tasks and encounters a PF_EXITING one, it
force-selects that task regardless of the score. The problem is that
if that task got stuck waiting for some state the allocation site is
holding, the OOM reaper can not move on to the next best victim.

Frankly, I don't even know why we check for exiting tasks in the OOM killer.
We've tried direct reclaim at least 15 times by the time we decide the system is OOM, so there was plenty of time to exit and free memory; and a task might exit voluntarily right after we issue a kill. This is testing pure noise. Remove it.
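
For illustration only, here is a rough userspace sketch in C of the selection behavior the changelog above describes. It is not kernel code and not the patch under discussion. Every name in it (toy_task, toy_select_victim, prefer_exiting) is made up for this example; the "exiting" field merely stands in for PF_EXITING and "score" for the OOM badness score.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a task. Field names are illustrative, not the kernel's. */
struct toy_task {
    const char *name;
    unsigned long score;  /* stand-in for the OOM badness score */
    bool exiting;         /* stand-in for PF_EXITING */
};

/*
 * Return the index of the chosen victim, or -1 if there are no tasks.
 *
 * With prefer_exiting set, the first exiting task found is returned
 * immediately and its score is never looked at (the force-select
 * described in the changelog). With prefer_exiting clear, the task
 * with the highest score wins; ties go to the earlier task.
 */
int toy_select_victim(const struct toy_task *tasks, size_t n,
                      bool prefer_exiting)
{
    int best = -1;
    unsigned long best_score = 0;

    assert(tasks != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (prefer_exiting && tasks[i].exiting)
            return (int)i;  /* force-select, score ignored */
        if (best < 0 || tasks[i].score > best_score) {
            best = (int)i;
            best_score = tasks[i].score;
        }
    }
    return best;
}
```

In this toy model, a low-score exiting task that never finishes exiting is returned on every call while prefer_exiting is set, so the higher-score task is never reached. With the check removed (prefer_exiting clear), the score alone decides.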