Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58561C433F5 for ; Wed, 24 Nov 2021 12:23:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244361AbhKXM0J (ORCPT ); Wed, 24 Nov 2021 07:26:09 -0500 Received: from mail.kernel.org ([198.145.29.99]:38292 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244542AbhKXMXj (ORCPT ); Wed, 24 Nov 2021 07:23:39 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id 1BF4E60FE7; Wed, 24 Nov 2021 12:14:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1637756062; bh=yODLhEdXW/hxuMxhwUWe/t5YzrfJquNFLeQ2MRlZOr0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OMkDHDqI6kIz95czYu7HcW7d50gDEAsorkpz/BYXZrMqOEcVzjsHGt2YHvKtS253V /3RHLrMxMh06nIqFyAsAOjSTf9IMkaVJIafTa4NtOIHzrZ5S/KtLj8bK7t4FUmYi0z BoUCHztRqnidnKFWphgs14Kfp3BY/0nc7ej/uMsw= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Vasily Averin , Michal Hocko , Johannes Weiner , Mel Gorman , Roman Gushchin , Shakeel Butt , Tetsuo Handa , Uladzislau Rezki , Vladimir Davydov , Vlastimil Babka , Andrew Morton , Linus Torvalds Subject: [PATCH 4.9 156/207] mm, oom: pagefault_out_of_memory: dont force global OOM for dying tasks Date: Wed, 24 Nov 2021 12:57:07 +0100 Message-Id: <20211124115709.060806757@linuxfoundation.org> X-Mailer: git-send-email 2.34.0 In-Reply-To: <20211124115703.941380739@linuxfoundation.org> References: <20211124115703.941380739@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Vasily Averin commit 0b28179a6138a5edd9d82ad2687c05b3773c387b upstream. Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3. Memory cgroup charging allows killed or exiting tasks to exceed the hard limit. It can be misused and allowed to trigger global OOM from inside a memcg-limited container. On the other hand if memcg fails allocation, called from inside #PF handler it triggers global OOM from inside pagefault_out_of_memory(). To prevent these problems this patchset: (a) removes execution of out_of_memory() from pagefault_out_of_memory(), becasue nobody can explain why it is necessary. (b) allow memcg to fail allocation of dying/killed tasks. This patch (of 3): Any allocation failure during the #PF path will return with VM_FAULT_OOM which in turn results in pagefault_out_of_memory which in turn executes out_out_memory() and can kill a random task. An allocation might fail when the current task is the oom victim and there are no memory reserves left. The OOM killer is already handled at the page allocator level for the global OOM and at the charging level for the memcg one. Both have much more information about the scope of allocation/charge request. This means that either the OOM killer has been invoked properly and didn't lead to the allocation success or it has been skipped because it couldn't have been invoked. In both cases triggering it from here is pointless and even harmful. It makes much more sense to let the killed task die rather than to wake up an eternally hungry oom-killer and send him to choose a fatter victim for breakfast. Link: https://lkml.kernel.org/r/0828a149-786e-7c06-b70a-52d086818ea3@virtuozzo.com Signed-off-by: Vasily Averin Suggested-by: Michal Hocko Acked-by: Michal Hocko Cc: Johannes Weiner Cc: Mel Gorman Cc: Roman Gushchin Cc: Shakeel Butt Cc: Tetsuo Handa Cc: Uladzislau Rezki Cc: Vladimir Davydov Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman --- mm/oom_kill.c | 3 +++ 1 file changed, 3 insertions(+) --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -1095,6 +1095,9 @@ void pagefault_out_of_memory(void) if (mem_cgroup_oom_synchronize(true)) return; + if (fatal_signal_pending(current)) + return; + if (!mutex_trylock(&oom_lock)) return; out_of_memory(&oc);