Date: Tue, 10 Mar 2020 17:34:53 -0700 (PDT)
From: David Rientjes
To: Andrew Morton
cc: Vlastimil Babka, Michal Hocko, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems
In-Reply-To: <20200310171802.128129f6817ef3f77d230ccd@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 10 Mar 2020, Andrew Morton wrote:

> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2637,6 +2637,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
> >  		unsigned long reclaimed;
> >  		unsigned long scanned;
> >
> > +		cond_resched();
> > +
> >  		switch (mem_cgroup_protected(target_memcg, memcg)) {
> >  		case MEMCG_PROT_MIN:
> >  			/*
>
> Obviously better, but this will still spin wheels until this task's
> timeslice expires, and we might want to do
> something to help ensure that the victim runs next (or soon)?
>

We used to have a schedule_timeout_killable(1) to address exactly that
scenario, but it was removed in 4.19:

    commit 9bfe5ded054b8e28a94c78580f233d6879a00146
    Author: Michal Hocko
    Date:   Fri Aug 17 15:49:04 2018 -0700

        mm, oom: remove sleep from under oom_lock

This is why we don't see this issue on 4.14 guests but we do on 4.19.  I
had assumed the issue Tetsuo reported that resulted in that patch was
still an issue, and I preferred to fix the weird UP issue by adding a
cond_resched() that is likely needed for the iteration in
shrink_node_memcgs() anyway.  Do we care to optimize for UP systems
encountering memcg oom kills?  Eh, maybe, but I'm not very interested in
opening up a centithread about this.

> (And why is shrink_node_memcgs compiled in when CONFIG_MEMCG=n?)
>

This guest does have CONFIG_MEMCG enabled; it's a memcg oom condition.

But unrelated to this patch, I think it's just a weird naming for it.  The
do-while loop in shrink_node_memcgs() actually uses memcg = NULL for the
non-memcg case and is responsible for calling into page and slab reclaim.
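
To illustrate the loop shape described above, here is a minimal userspace
sketch, not the kernel code.  It assumes a toy linked list in place of the
memcg hierarchy, and every name in it (toy_memcg, toy_iter, toy_reclaim,
toy_shrink_node_memcgs, walk_stats) is hypothetical.  sched_yield() stands
in for cond_resched() only loosely: it always offers the CPU, whereas
cond_resched() reschedules only when a reschedule is pending.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mem_cgroup; not the kernel type. */
struct toy_memcg {
	const char *name;
	struct toy_memcg *next;
};

/* Counters so a caller can observe what the walk did. */
struct walk_stats {
	int iterations;	/* loop bodies executed */
	int yields;	/* times the CPU was offered to another task */
};

/*
 * Hypothetical iterator: pass prev = NULL to get the first entry and
 * receive NULL once the walk is done.  With an empty list (the
 * "non-memcg" case) the very first call already returns NULL.
 */
static struct toy_memcg *toy_iter(struct toy_memcg *first,
				  struct toy_memcg *prev)
{
	return prev ? prev->next : first;
}

/* Placeholder for the per-memcg page and slab reclaim work. */
static void toy_reclaim(struct toy_memcg *memcg)
{
	(void)memcg;	/* NULL means "reclaim without a memcg" */
}

static struct walk_stats toy_shrink_node_memcgs(struct toy_memcg *first)
{
	struct walk_stats stats = { 0, 0 };
	struct toy_memcg *memcg = toy_iter(first, NULL);

	do {
		/* Userspace analogue of the added cond_resched() call. */
		sched_yield();
		stats.yields++;

		/* Runs even when memcg == NULL, which is the point of do-while. */
		toy_reclaim(memcg);
		stats.iterations++;
	} while ((memcg = toy_iter(first, memcg)));

	/* A do-while body always executes at least once. */
	assert(stats.iterations >= 1);
	return stats;
}
```

Calling toy_shrink_node_memcgs(NULL) still runs the body once, which
mirrors why the function is needed even without memcgs.  Placing the yield
at the top of the body means every pass gives other tasks a chance to run,
however long the list is.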