Subject: Re: [RFC PATCH 2/2] memcg: do not report racy no-eligible OOM tasks
To: Michal Hocko
Cc: linux-mm@kvack.org, Johannes Weiner, David Rientjes, Andrew Morton, LKML
References: <20181022071323.9550-1-mhocko@kernel.org> <20181022071323.9550-3-mhocko@kernel.org> <20181022120308.GB18839@dhcp22.suse.cz>
 <0a84d3de-f342-c183-579b-d672c116ba25@i-love.sakura.ne.jp> <20181022134315.GF18839@dhcp22.suse.cz>
From: Tetsuo Handa
Message-ID: <2deec266-2eaf-f754-ae94-d290f10c79ec@i-love.sakura.ne.jp>
Date: Tue, 23 Oct 2018 00:12:48 +0900
In-Reply-To: <20181022134315.GF18839@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018/10/22 22:43, Michal Hocko wrote:
> On Mon 22-10-18 22:20:36, Tetsuo Handa wrote:
>> I mean:
>>
>>  mm/memcontrol.c |   3 +-
>>  mm/oom_kill.c   | 111 +++++--------------------------------------------------
>>  2 files changed, 12 insertions(+), 102 deletions(-)
>
> This is much larger change than I feel comfortable with to plug this
> specific issue. A simple and easy to understand fix which doesn't add
> maintenance burden should be preferred in general.
>
> The code reduction looks attractive but considering it is based on
> removing one of the heuristics to prevent OOM reports in some case it
> should be done on its own with a careful and throughout justification.
> E.g. how often is the heuristic really helpful.

I think the heuristic is hardly helpful.

Regarding the task_will_free_mem(current) condition in out_of_memory(),
it served two purposes.

One is the case where mark_oom_victim() has not yet been called on the
current thread group although it was already called on other thread
groups. But that situation disappears once the task_will_free_mem()
shortcuts are removed and the for_each_process(p) loop in
__oom_kill_process() is always executed.

The other is the case where mark_oom_victim() has not yet been called on
any thread group while all thread groups are exiting. In that case, we
will fail to wait for the current thread group to release its mm...
But it is unlikely that only threads for which
task_will_free_mem(current) returns true can call out_of_memory()
(note that task_will_free_mem(p) returns false if p->mm == NULL).

I think it is highly unlikely that we hit the task_will_free_mem(p)
condition in oom_kill_process(). To hit it, the candidate that was
chosen for being the largest memory user has to be already exiting.
However, if it is already exiting, it has likely already released its mm
(and hence is no longer the largest memory user). I can't say such a
race never happens, but I think it is unlikely. Also, since
task_will_free_mem(p) returns false if the thread group leader's mm is
NULL, whereas oom_badness() from select_bad_process() evaluates any mm
in that thread group and returns the thread group leader, this heuristic
is incomplete after all.

>
> In principle I do not oppose to remove the shortcut after all due
> diligence is done because this particular one had given us quite a lot
> headaches in the past.
>