Received: by 2002:a25:e7d8:0:0:0:0:0 with SMTP id e207csp3146949ybh; Mon, 16 Mar 2020 17:00:20 -0700 (PDT) X-Google-Smtp-Source: ADFU+vtULxoR/GRD7vxKKP37Z6SocXDJMOzp+QeK2qQc9B/Y5DWPNwHTIa5l5T2h4SvFf5po2HN3 X-Received: by 2002:a54:4501:: with SMTP id l1mr1552730oil.33.1584403220247; Mon, 16 Mar 2020 17:00:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1584403220; cv=none; d=google.com; s=arc-20160816; b=P0TyvpTtAmOsY9M5qbn7wxjqHb7rZCYabjY/gjXQ4L9vzR0o0YvQYEo3sILMPtCS/g Nxu3msxLN17nCpLqmSHhMYO02ooow2SvsNQiTO/WBGnw6S3jC26pGAHk20OVN1MBEkCd PgobemowbVL9XdyjEd35NCNkiqqdba++G+C1thdp1nTr8R9DzOrQZqmpEBlFMc6NIjNI 8RpUUQwyFmsC2UQcjNYrI4OfNvxGegnLs0lAfTXU1aSj7jV/ilshVwdr1Q6tjZuWeq1X wNjV/xnJkUUIxCerPlzD/OfwFZrximI2GQ1Ll9gmNlUzVsuBUOvIAHB9CM1Vz7GQ5z5H zjqA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :message-id:in-reply-to:subject:cc:to:from:date:dkim-signature; bh=qEANOgtBSueUJD8m8QfmAMA6kpeJTGDGbSVYwjdAhYY=; b=P9suI36XZz12+Dd71OgKvbVN11uTKW0vYxsZGHwYLM8stiSpmCjumQeoLwiG6+qEmb lq/KiWs58QrPcweTXzzioF8X2jLDjO0IwtHnL3qWIO2Q0t6uSfEckPnNtFC2TPywzKsQ 0a7xE4hno+c/faqdImOiFsHBpi9wWr+vpY5Zu+dhrYZROzQLyrX3s06maoMwyw2wuDNy AcAWlEO4u9iz5vUD16Ca7Tvli60hAhgwPG/Avdtk49301eb1i4QPykEl44xXR5WDxUa7 e53d7iVLfxexcC3aDVecunsGpcfhF6qW6BKXnIIA1o9ST9magvTlHJ2+WUZWlwWzWyNL H9Gw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=LZFs1AHZ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id p203si744357oic.214.2020.03.16.17.00.08; Mon, 16 Mar 2020 17:00:20 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=LZFs1AHZ; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733016AbgCPX7G (ORCPT + 99 others); Mon, 16 Mar 2020 19:59:06 -0400 Received: from mail-pf1-f196.google.com ([209.85.210.196]:35557 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732932AbgCPX7F (ORCPT ); Mon, 16 Mar 2020 19:59:05 -0400 Received: by mail-pf1-f196.google.com with SMTP id u68so10865729pfb.2 for ; Mon, 16 Mar 2020 16:59:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version; bh=qEANOgtBSueUJD8m8QfmAMA6kpeJTGDGbSVYwjdAhYY=; b=LZFs1AHZzWCXFBABx/gWhz/foi529c2VuScqMgysiZIwQNb4XFuH27xKV9ySQQcphv 7bc4HItVdEKt7qsWHI7E8YT9HiIFfwfvuLJf9b8uipeVYZLk86YR8/kD365ty5Lk1o0F kYUVAMzZPmjXfabJFjpr6Z+aJWBWP8WVcVvxudBsXr7mb8CZCo4aBT2BsWmPH9v1zCTM wcqg5komC29tmplhp7S7hpoWQfrhDZldsd9HoITurHRsw+Kq1nOvvgTeazAc46MAg/Tg INZLgApQgCZLSwknjOwJFUKxJjOjuajrYcnRCvqlsaiQahCa0IudUoLXG+3lRyx4qEq/ pqCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:in-reply-to:message-id :references:user-agent:mime-version; bh=qEANOgtBSueUJD8m8QfmAMA6kpeJTGDGbSVYwjdAhYY=; b=cVDnaVg+RLeXWoX3qn5dRM2DeY6xm7NdWDxm1vF6IAmj+4rpFnpB0GHdrpFXmT6tLX Mnr4d8hkofRNHJleqETRBnTImuxqy1CNMP0SYaBZlSnAiorWYAliv0tFPHoD1efPKvzJ k3g5upEUfC5mrfWOdFnOavIYlJbR17ARbGG12EynklzTmu+MG68USYo8p+V1H67Ualel kcSz9TdFlHXwgokNaBELtBsPsp5JRBBKvKxZbRV9QMZpViuojFC5ARx1cEbkpQxdH8PI byktDo8998cgpeStMArVaLwnERcBfvkI/H7BPHlXU+2zBPUggr3GxDvjmkPxuYbD16cQ hEgg== X-Gm-Message-State: ANhLgQ2YPnSTjE7N8VriHLgEPFQFHB6sN7/mspAdidZ2ygQ71OmXgjGn eGFDDjh4zYfEvQ2M1dkfL4sgGGMwCRs= X-Received: by 2002:a63:2ec1:: with SMTP id u184mr2169699pgu.446.1584403144583; Mon, 16 Mar 2020 16:59:04 -0700 (PDT) Received: from [2620:15c:17:3:3a5:23a7:5e32:4598] ([2620:15c:17:3:3a5:23a7:5e32:4598]) by smtp.gmail.com with ESMTPSA id z16sm930252pfr.138.2020.03.16.16.59.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 16 Mar 2020 16:59:03 -0700 (PDT) Date: Mon, 16 Mar 2020 16:59:02 -0700 (PDT) From: David Rientjes X-X-Sender: rientjes@chino.kir.corp.google.com To: Tetsuo Handa cc: Andrew Morton , Vlastimil Babka , Michal Hocko , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems In-Reply-To: <8395df04-9b7a-0084-4bb5-e430efe18b97@i-love.sakura.ne.jp> Message-ID: References: <202003120012.02C0CEUB043533@www262.sakura.ne.jp> <202003130015.02D0F9uT079462@www262.sakura.ne.jp> <8395df04-9b7a-0084-4bb5-e430efe18b97@i-love.sakura.ne.jp> User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, 14 Mar 2020, Tetsuo Handa wrote: > > If current thread is > > an OOM victim, schedule_timeout_killable(1) will give other threads (including > > the OOM reaper kernel thread) CPU time to run. > > If current thread is an OOM victim, schedule_timeout_killable(1) will give other > threads (including the OOM reaper kernel thread) CPU time to run, by leaving > try_charge() path due to should_force_charge() == true and reaching do_exit() path > instead of returning to userspace code doing "for (;;);". > > Unless the problem is that current thread cannot reach should_force_charge() check, > schedule_timeout_killable(1) should work. > No need to yield if current is the oom victim, allowing the oom reaper to run when it may not actually be able to free memory is not required. It increases the likelihood that some other process schedules and is unable to yield back due to the memcg oom condition such that the victim doesn't get a chance to run again. This happens because the victim is allowed to overcharge but other processes attached to an oom memcg hierarchy simply fail the charge. We are then reliant on all memory chargers in the kernel to yield if their charges fail due to oom. It's the only way to allow the victim to eventually run. So the only change that I would make to your patch is to do this in mem_cgroup_out_of_memory() instead: if (!fatal_signal_pending(current)) schedule_timeout_killable(1); So we don't have this reliance on all other memory chargers to yield when their charge fails and there is no delay for victims themselves. [ I'll still propose my change that adds cond_resched() to shrink_node_memcgs() because we can see need_resched set for a prolonged period of time without scheduling. ] If you agree, I'll propose your patch with a changelog that indicates it can fix the soft lockup issue for UP and can likely get a tested-by for it.