Date: Wed, 18 Mar 2020 15:03:52 -0700 (PDT)
From: David Rientjes
To: Andrew Morton
Cc: Michal Hocko, Tetsuo Handa, Vlastimil Babka, Robert Kolchmeyer,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [patch v3] mm, oom: prevent soft lockup on memcg oom for UP systems
References: <8395df04-9b7a-0084-4bb5-e430efe18b97@i-love.sakura.ne.jp>
    <202003170318.02H3IpSx047471@www262.sakura.ne.jp>
    <20200318094219.GE21362@dhcp22.suse.cz>

When a process is oom killed as a result of memcg limits and the victim
is waiting to exit, nothing ends up actually yielding the processor back
to the victim on UP systems with preemption disabled.  Instead, the
charging process simply loops in memcg reclaim and eventually triggers a
soft lockup.

For example, on a UP system with a memcg limited to 100MB, if three
processes each charge 40MB of heap with swap disabled, one of the charging
processes can loop endlessly trying to charge memory, which starves the oom
victim.

Memory cgroup out of memory: Killed process 808 (repro) total-vm:41944kB, anon-rss:35344kB, file-rss:504kB, shmem-rss:0kB, UID:0 pgtables:108kB oom_score_adj:0
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [repro:806]
CPU: 0 PID: 806 Comm: repro Not tainted 5.6.0-rc5+ #136
RIP: 0010:shrink_lruvec+0x4e9/0xa40
...
Call Trace:
 shrink_node+0x40d/0x7d0
 do_try_to_free_pages+0x13f/0x470
 try_to_free_mem_cgroup_pages+0x16d/0x230
 try_charge+0x247/0xac0
 mem_cgroup_try_charge+0x10a/0x220
 mem_cgroup_try_charge_delay+0x1e/0x40
 handle_mm_fault+0xdf2/0x15f0
 do_user_addr_fault+0x21f/0x420
 page_fault+0x2f/0x40

Make sure that, once the oom killer has been called, we forcibly yield if
current is not the chosen victim, regardless of priority, to allow for
memory freeing.  The same situation can theoretically occur in the page
allocator, so do this after dropping oom_lock there as well.

We used to have a short sleep after oom killing, but commit 9bfe5ded054b
("mm, oom: remove sleep from under oom_lock") removed it because sleeping
inside the oom_lock is dangerous.  This patch restores the sleep outside of
the lock.

Suggested-by: Tetsuo Handa
Tested-by: Robert Kolchmeyer
Cc: stable@vger.kernel.org
Signed-off-by: David Rientjes
---
 mm/memcontrol.c | 6 ++++++
 mm/page_alloc.c | 6 ++++++
 2 files changed, 12 insertions(+)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1576,6 +1576,12 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	 */
 	ret = should_force_charge() || out_of_memory(&oc);
 	mutex_unlock(&oom_lock);
+	/*
+	 * Give a killed process a good chance to exit before trying to
+	 * charge memory again.
+	 */
+	if (ret)
+		schedule_timeout_killable(1);
 	return ret;
 }

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3861,6 +3861,12 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	}
 out:
 	mutex_unlock(&oom_lock);
+	/*
+	 * Give a killed process a good chance to exit before trying to
+	 * allocate memory again.
+	 */
+	if (*did_some_progress)
+		schedule_timeout_killable(1);
 	return page;
 }
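
For readers who want to see why the one-tick sleep matters, the following
is a userspace toy model, not kernel code.  It assumes a single CPU with no
preemption, an already-killed victim that can only release its memory when
it gets to run, and a charger that retries until a tick budget runs out.
The names simulate_charge, struct sim_result and watchdog_ticks are
hypothetical and exist only for this sketch; the tick budget loosely stands
in for the soft lockup watchdog threshold.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one CPU, no preemption.  Nothing here is a kernel API. */
struct sim_result {
	bool charge_succeeded;	/* did the charger eventually get its memory? */
	int ticks_spent;	/* retry iterations used by the charger */
};

/*
 * A charger wants want_mb under a limit of limit_mb while used_mb is
 * already charged, victim_mb of which belongs to a killed victim.  The
 * victim frees its memory only when it runs, and on this toy UP system it
 * runs only if the charger yields after a failed attempt.
 */
struct sim_result simulate_charge(int limit_mb, int used_mb, int victim_mb,
				  int want_mb, bool yield_after_oom,
				  int watchdog_ticks)
{
	struct sim_result res = { false, 0 };

	assert(limit_mb > 0 && watchdog_ticks > 0);
	assert(victim_mb >= 0 && victim_mb <= used_mb);

	while (res.ticks_spent < watchdog_ticks) {
		res.ticks_spent++;

		if (used_mb + want_mb <= limit_mb) {
			/* The charge fits under the limit: done. */
			res.charge_succeeded = true;
			return res;
		}

		/* The charge failed and a victim has already been chosen. */
		if (yield_after_oom && victim_mb > 0) {
			/* Charger sleeps a tick; the victim runs and exits. */
			used_mb -= victim_mb;
			victim_mb = 0;
		}
		/* Without the yield nothing else runs; just retry. */
	}

	/* Tick budget exhausted: the toy equivalent of a soft lockup. */
	return res;
}
```

Using simplified numbers based on the example above (limit 100, 80 in use,
40 of that held by the victim, 40 requested), a call with
yield_after_oom == false uses up the whole tick budget without succeeding,
while the same call with yield_after_oom == true succeeds on the second
attempt.  The model deliberately ignores reclaim, priorities and
should_force_charge(); it only illustrates the ordering problem the patch
addresses.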