Date: Mon, 6 Aug 2018 21:45:53 +0200
From: Michal Hocko
To: syzbot
Cc: cgroups@vger.kernel.org, dvyukov@google.com, hannes@cmpxchg.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    penguin-kernel@I-love.SAKURA.ne.jp, syzkaller-bugs@googlegroups.com,
    vdavydov.dev@gmail.com
Subject: Re: WARNING in try_charge
Message-ID: <20180806194553.GH10003@dhcp22.suse.cz>
References: <20180806185554.GG10003@dhcp22.suse.cz>
    <0000000000006986c30572c90de3@google.com>
In-Reply-To: <0000000000006986c30572c90de3@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

[CCing Greg - the email thread starts here
http://lkml.kernel.org/r/0000000000005e979605729c1564@google.com]

On Mon 06-08-18
12:12:02, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch and the reproducer did not trigger
> crash:

OK, this is reassuring. Btw Greg has pointed out this potential case
http://lkml.kernel.org/r/xr93in62jy8k.fsf@gthelen.svl.corp.google.com
but I simply didn't get what he meant. He was suggesting MMF_OOM_SKIP but
I didn't get why that matters. I didn't think about a race.

So how about this patch:

From 74d980f8d066d06ada657ebf9b586dbf5668ed26 Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Mon, 6 Aug 2018 21:21:24 +0200
Subject: [PATCH] memcg, oom: be careful about races when warning about no
 reclaimable task

"memcg, oom: move out_of_memory back to the charge path" has added a
warning triggered when the oom killer cannot find any eligible task
and so there is no way to reclaim the oom memcg under its hard limit.
Further charges for such a memcg are forced and therefore the hard limit
isolation is weakened.

The current warning is however too eager to trigger even when we are not
really hitting the above condition. Syzbot and Greg Thelen have noticed
that we can hit this condition even when there is still an oom victim
pending. E.g. the following race is possible:

memcg has two tasks taskA, taskB.

CPU1 (taskA)                    CPU2                    CPU3 (taskB)
try_charge
  mem_cgroup_out_of_memory                              try_charge
      select_bad_process(taskB)
      oom_kill_process          oom_reap_task
                                # No real memory reaped
                                                          mem_cgroup_out_of_memory
                                # set taskB -> MMF_OOM_SKIP
  # retry charge
  mem_cgroup_out_of_memory
    oom_lock                                                oom_lock
    select_bad_process(self)
    oom_kill_process(self)
    oom_unlock
                                                            # no eligible task

In fact the syzbot test triggered this situation by placing multiple tasks
into a memcg with the hard limit set to 0.
So no task really had any memory charged to the memcg

: Memory cgroup stats for /ile0: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
: Tasks state (memory values in pages):
: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
: [   6569]     0  6562     9427        1          53248        0             0 syz-executor0
: [   6576]     0  6576     9426        0          61440        0             0 syz-executor6
: [   6578]     0  6578     9426      534          61440        0             0 syz-executor4
: [   6579]     0  6579     9426        0          57344        0             0 syz-executor5
: [   6582]     0  6582     9426        0          61440        0             0 syz-executor7
: [   6584]     0  6584     9426        0          57344        0             0 syz-executor1

so in principle there is indeed nothing reclaimable in this memcg and
this looks like a misconfiguration. On the other hand we can clearly
kill all those tasks so it is a bit early to warn and scare users. Avoid
that by checking whether current is the oom victim and bypassing the
warning in that case. The victim is allowed to force charge and terminate,
releasing its temporary charge along the way.

Fixes: "memcg, oom: move out_of_memory back to the charge path"
Noticed-by: Greg Thelen
Reported-and-tested-by: syzbot+bab151e82a4e973fa325@syzkaller.appspotmail.com
Signed-off-by: Michal Hocko
---
 mm/memcontrol.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4603ad75c9a9..1b6eed1bc404 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1703,7 +1703,8 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
 		return OOM_ASYNC;
 	}
 
-	if (mem_cgroup_out_of_memory(memcg, mask, order))
+	if (mem_cgroup_out_of_memory(memcg, mask, order) ||
+			tsk_is_oom_victim(current))
 		return OOM_SUCCESS;
 
 	WARN(1,"Memory cgroup charge failed because of no reclaimable memory! "
-- 
2.18.0

-- 
Michal Hocko
SUSE Labs
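
For readers who want to see the effect of the one-line change in isolation,
here is a minimal userspace sketch of the warning condition before and after
the patch. It is not kernel code: struct charge_ctx and both function names
are hypothetical, and the two booleans merely stand in for the results of
mem_cgroup_out_of_memory() and tsk_is_oom_victim(current) at the point where
mem_cgroup_oom() decides whether to warn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model only. Both fields are illustrative stand-ins for
 * kernel state and are not part of the patch.
 */
struct charge_ctx {
	bool killer_made_progress;	/* models mem_cgroup_out_of_memory() == true */
	bool current_is_victim;		/* models tsk_is_oom_victim(current) == true */
};

/* Before the patch: warn whenever the oom killer found nothing to kill. */
bool warns_before_patch(const struct charge_ctx *ctx)
{
	assert(ctx != NULL);
	return !ctx->killer_made_progress;
}

/* After the patch: a task that is already an oom victim does not warn. */
bool warns_after_patch(const struct charge_ctx *ctx)
{
	assert(ctx != NULL);
	return !(ctx->killer_made_progress || ctx->current_is_victim);
}
```

In the race described in the changelog, taskB on CPU3 corresponds to
killer_made_progress = false with current_is_victim = true: the old
condition warns and the new one does not. A memcg with no eligible task
and a charging task that is not itself a victim still warns under both
conditions.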