Received: by 2002:a25:e7d8:0:0:0:0:0 with SMTP id e207csp3305219ybh; Mon, 16 Mar 2020 20:31:12 -0700 (PDT) X-Google-Smtp-Source: ADFU+vtEg6dnLfm5DdqgHzq8IJmztyqcDWNfN56TwAuLoCXP1DrrDzzirCfb8HVnWW33zSiVlBnJ X-Received: by 2002:a9d:66d5:: with SMTP id t21mr2018447otm.284.1584415872006; Mon, 16 Mar 2020 20:31:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1584415871; cv=none; d=google.com; s=arc-20160816; b=lnn/RWGf5NEJ7Yv/szofdaQjPonCop0l74lFpCRqD/y+F29iCa3DUglq2eJ1vvomB0 vf0Gi4154FO8MD2emRhSCc1H3oOrwCAB7Y1+G5l6OdjKiLEmkXPifomY+M0SUzbkhUWU /jWOd6gbhPAxQDs0h9yKmfa6EpDpO5lWH9LeAjKaCpO9Fh+immqyQJW5/uVaReJ9x4C/ YZwlfKUClbWoWUC2QB3zqFynGZTI9H/XSQDA6eajfT/TYcMW0PWQcTfb/SvtZguNyC9q VQuvaScTiC5zQqG5UVlGstlVenlRq1o5I6XoKioZtZ9LYWXHwb5wFAklBkJW1rNRYrY6 F76A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:in-reply-to :references:date:mime-version:cc:to:from:subject:message-id; bh=omZY4Jd4M8QS7RuA/kwSasnMG5ZCg2WKmng/QlPhbLM=; b=v+475VG5E7G/hoTje8pExYCZTq/EX4EQqodl12MO9BuXLYzE5NnxUVLJvfJgp7DRG5 MZ1aHnQHQicRXOKF4Up500gaTq0byYI5A4GkKf6vv13Cxl9OYunOl+vASL2k7wuB/JRg 8Dkfrtg3dGA4IadE88PvT1bGZP4eA0KgZCQQs4IInnBZ64p8KoZqTnI6B2w1D0vfXqHf xC9XZfqNLQVudAX/Z08OCNCAznEhtpzcBmP5czJNvEaQeBMuPbz+FkqUVy4x9XR2uogr opezk9qbZkrnRB8rCz786zjyhKqRni/H8YNLGWVJRBzwkoVVV+cG57ZtT0PyKfUxEaAZ pBIg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k137si1033549oih.226.2020.03.16.20.31.00; Mon, 16 Mar 2020 20:31:11 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726890AbgCQDTK (ORCPT + 99 others); Mon, 16 Mar 2020 23:19:10 -0400 Received: from www262.sakura.ne.jp ([202.181.97.72]:58659 "EHLO www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726803AbgCQDTK (ORCPT ); Mon, 16 Mar 2020 23:19:10 -0400 Received: from fsav401.sakura.ne.jp (fsav401.sakura.ne.jp [133.242.250.100]) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTP id 02H3IpDp047482; Tue, 17 Mar 2020 12:18:51 +0900 (JST) (envelope-from penguin-kernel@i-love.sakura.ne.jp) Received: from www262.sakura.ne.jp (202.181.97.72) by fsav401.sakura.ne.jp (F-Secure/fsigk_smtp/550/fsav401.sakura.ne.jp); Tue, 17 Mar 2020 12:18:51 +0900 (JST) X-Virus-Status: clean(F-Secure/fsigk_smtp/550/fsav401.sakura.ne.jp) Received: from www262.sakura.ne.jp (localhost [127.0.0.1]) by www262.sakura.ne.jp (8.15.2/8.15.2) with ESMTP id 02H3Ip6f047472; Tue, 17 Mar 2020 12:18:51 +0900 (JST) (envelope-from penguin-kernel@i-love.sakura.ne.jp) Received: (from i-love@localhost) by www262.sakura.ne.jp (8.15.2/8.15.2/Submit) id 02H3IpSx047471; Tue, 17 Mar 2020 12:18:51 +0900 (JST) (envelope-from penguin-kernel@i-love.sakura.ne.jp) Message-Id: <202003170318.02H3IpSx047471@www262.sakura.ne.jp> X-Authentication-Warning: www262.sakura.ne.jp: i-love set sender to penguin-kernel@i-love.sakura.ne.jp using -f Subject: Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems From: Tetsuo Handa To: David Rientjes Cc: Andrew Morton , Vlastimil Babka , Michal Hocko , linux-kernel@vger.kernel.org, linux-mm@kvack.org MIME-Version: 1.0 Date: Tue, 17 Mar 2020 12:18:51 +0900 References: <8395df04-9b7a-0084-4bb5-e430efe18b97@i-love.sakura.ne.jp> In-Reply-To: Content-Type: text/plain; charset="ISO-2022-JP" Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org David Rientjes wrote: > On Sat, 14 Mar 2020, Tetsuo Handa wrote: > > If current thread is an OOM victim, schedule_timeout_killable(1) will give other > > threads (including the OOM reaper kernel thread) CPU time to run, by leaving > > try_charge() path due to should_force_charge() == true and reaching do_exit() path > > instead of returning to userspace code doing "for (;;);". > > > > Unless the problem is that current thread cannot reach should_force_charge() check, > > schedule_timeout_killable(1) should work. > > > > No need to yield if current is the oom victim, allowing the oom reaper to > run when it may not actually be able to free memory is not required. It > increases the likelihood that some other process schedules and is unable > to yield back due to the memcg oom condition such that the victim doesn't > get a chance to run again. > > This happens because the victim is allowed to overcharge but other > processes attached to an oom memcg hierarchy simply fail the charge. We > are then reliant on all memory chargers in the kernel to yield if their > charges fail due to oom. It's the only way to allow the victim to > eventually run. > > So the only change that I would make to your patch is to do this in > mem_cgroup_out_of_memory() instead: > > if (!fatal_signal_pending(current)) > schedule_timeout_killable(1); > > So we don't have this reliance on all other memory chargers to yield when > their charge fails and there is no delay for victims themselves. I see. You want below functions for environments where current thread can fail to resume execution for long if current thread once reschedules (e.g. UP kernel, many threads contended on one CPU). /* * Give other threads CPU time, unless current thread was already killed. * Used when we prefer killed threads to continue execution (in a hope that * killed threads terminate quickly) over giving other threads CPU time. */ signed long __sched schedule_timeout_killable_expedited(signed long timeout) { if (unlikely(fatal_signal_pending(current))) return timeout; return schedule_timeout_killable(timeout); } /* * Latency reduction via explicit rescheduling in places that are safe, * but becomes no-op if current thread was already killed. Used when we * prefer killed threads to continue execution (in a hope that killed * threads terminate quickly) over giving other threads CPU time. */ int cond_resched_expedited(void) { if (unlikely(fatal_signal_pending(current))) return 0; return cond_resched(); } > > [ I'll still propose my change that adds cond_resched() to > shrink_node_memcgs() because we can see need_resched set for a > prolonged period of time without scheduling. ] As long as there is schedule_timeout_killable(), I'm fine with adding cond_resched() in other places. > > If you agree, I'll propose your patch with a changelog that indicates it > can fix the soft lockup issue for UP and can likely get a tested-by for > it. > Please go ahead.