Date: Wed, 18 Mar 2020 10:55:14 +0100
From: Michal Hocko
To: Robert Kolchmeyer
Cc: David Rientjes, Andrew Morton, Vlastimil Babka, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Ami Fischman
Subject: Re: [patch] mm, oom: make a last minute check to prevent unnecessary memcg oom kills
Message-ID: <20200318095514.GF21362@dhcp22.suse.cz>
References: <20200310221938.GF8447@dhcp22.suse.cz>

On Tue 17-03-20 11:25:52, Robert Kolchmeyer wrote:
> On Tue, Mar 10, 2020 at 3:54 PM David Rientjes wrote:
> >
> > Robert, could you elaborate on the user-visible effects of this issue that
> > caused it to initially get reported?
> >
>
> Ami (now cc'ed) knows more, but here is my understanding. The use case
> involves a Docker container running multiple processes. The container
> has a memory limit set. The container contains two long-lived,
> important processes p1 and p2, and some arbitrary, dynamic number of
> usually ephemeral processes p3,...,pn. These processes are structured
> in a hierarchy that looks like p1->p2->[p3,...,pn]; p1 is a parent of
> p2, and p2 is the parent for all of the ephemeral processes p3,...,pn.
>
> Since p1 and p2 are long-lived and important, the user does not want
> p1 and p2 to be oom-killed. However, p3,...,pn are expected to use a
> lot of memory, and it's ok for those processes to be oom-killed.
>
> If the user sets oom_score_adj on p1 and p2 to make them very unlikely
> to be oom-killed, p3,...,pn will inherit the oom_score_adj value,
> which is bad.
> Additionally, setting oom_score_adj on p3,...,pn is
> tricky, since processes in the Docker container (specifically p1 and
> p2) don't have permissions to set oom_score_adj on p3,...,pn. The
> ephemeral nature of p3,...,pn also makes setting oom_score_adj on them
> tricky after they launch.

Thanks for the clarification.

> So, the user hopes that when one of p3,...,pn triggers an oom
> condition in the Docker container, the oom killer will almost always
> kill processes from p3,...,pn (and not kill p1 or p2, which are both
> important and unlikely to trigger an oom condition). The issue of more
> processes being killed than are strictly necessary is resulting in p1
> or p2 being killed much more frequently when one of p3,...,pn triggers
> an oom condition, and p1 or p2 being killed is very disruptive for the
> user (my understanding is that p1 or p2 going down with high frequency
> results in significant unhealthiness in the user's service).

Do you have any logs showing this condition? I am interested because,
from your description, it seems that p1/p2 usually shouldn't be the ones
that trigger the oom, right? That suggests it should mostly be p3,...,pn
that are in the kernel triggering the oom, and therefore they shouldn't
vanish.

-- 
Michal Hocko
SUSE Labs
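
The oom_score_adj inheritance described in the quoted text can be checked from userspace. What follows is only a minimal sketch, assuming a Linux procfs layout with /proc/<pid>/stat and /proc/<pid>/oom_score_adj. The helper names (read_oom_score_adj, read_ppid, children_sharing_adj) and the proc_root parameter are hypothetical, introduced here for illustration. The code only reads values and changes nothing.

```python
from pathlib import Path


def read_oom_score_adj(pid, proc_root="/proc"):
    """Return the oom_score_adj value of a process as an int."""
    text = (Path(proc_root) / str(pid) / "oom_score_adj").read_text()
    return int(text.strip())


def read_ppid(pid, proc_root="/proc"):
    """Return the parent PID recorded in /proc/<pid>/stat."""
    stat = (Path(proc_root) / str(pid) / "stat").read_text()
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split after the last ')'. The fields that follow are: state, ppid, ...
    fields = stat[stat.rfind(")") + 1:].split()
    return int(fields[1])


def children_sharing_adj(parent_pid, proc_root="/proc"):
    """List (pid, oom_score_adj) for direct children whose value equals the parent's."""
    parent_adj = read_oom_score_adj(parent_pid, proc_root)
    result = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue
        pid = int(entry.name)
        try:
            if read_ppid(pid, proc_root) != parent_pid:
                continue
            adj = read_oom_score_adj(pid, proc_root)
        except (OSError, ValueError, IndexError):
            # The process exited between listing and reading; expected for
            # short-lived children such as p3,...,pn.
            continue
        if adj == parent_adj:
            result.append((pid, adj))
    return sorted(result)
```

Calling children_sharing_adj() with the PID of p2 would list the children that currently carry the same adjustment as p2. A matching value is consistent with inheritance but does not prove it, since a child could have been set to the same value explicitly.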