Date: Fri, 25 May 2018 09:26:36 +0200
From: Michal Hocko
To: David Rientjes
Cc: Tetsuo Handa, Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [rfc patch] mm, oom: fix unnecessary killing of additional processes
Message-ID: <20180525072636.GE11881@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 24-05-18 14:22:53, David Rientjes wrote:
> The oom reaper ensures forward progress by setting MMF_OOM_SKIP itself if
> it cannot reap an mm.
> This can happen for a variety of reasons,
> including:
>
>  - the inability to grab mm->mmap_sem in a sufficient amount of time,
>
>  - when the mm has blockable mmu notifiers that could cause the oom reaper
>    to stall indefinitely,
>
> but we can also add a third when the oom reaper can "reap" an mm but doing
> so is unlikely to free any amount of memory:
>
>  - when the mm's memory is fully mlocked.
>
> When all memory is mlocked, the oom reaper will not be able to free any
> substantial amount of memory. It sets MMF_OOM_SKIP before the victim can
> unmap and free its memory in exit_mmap() and subsequent oom victims are
> chosen unnecessarily. This is trivial to reproduce if all eligible
> processes on the system have mlocked their memory: the oom killer calls
> panic() even though forward progress can be made.
>
> This is the same issue where the exit path sets MMF_OOM_SKIP before
> unmapping memory and additional processes can be chosen unnecessarily
> because the oom killer is racing with exit_mmap().
>
> We can't simply defer setting MMF_OOM_SKIP, however, because if there is
> a true oom livelock in progress, it never gets set and no additional
> killing is possible.
>
> To fix this, this patch introduces a per-mm reaping timeout, initially set
> at 10s. It requires that the oom reaper's list becomes a properly linked
> list so that other mm's may be reaped while waiting for an mm's timeout to
> expire.

No timeouts please! The proper way to handle this problem is to simply
teach the oom reaper to handle mlocked areas.
--
Michal Hocko
SUSE Labs