From: Robert Kolchmeyer
Date: Tue, 17 Mar 2020 11:25:52 -0700
Subject: Re: [patch] mm, oom: make a last minute check to prevent unnecessary memcg oom kills
To: David Rientjes
Cc: Michal Hocko, Andrew Morton, Vlastimil Babka, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Ami Fischman
References: <20200310221938.GF8447@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 10, 2020 at 3:54 PM David Rientjes wrote:
>
> Robert, could you elaborate on the user-visible effects of this issue that
> caused it to initially get reported?
>

Ami (now cc'ed) knows more, but here is my understanding.

The use case involves a Docker container running multiple processes. The container has a memory limit set. The container contains two long-lived, important processes p1 and p2, and some arbitrary, dynamic number of usually ephemeral processes p3,...,pn. These processes are structured in a hierarchy that looks like p1->p2->[p3,...,pn]; p1 is a parent of p2, and p2 is the parent for all of the ephemeral processes p3,...,pn.

Since p1 and p2 are long-lived and important, the user does not want p1 and p2 to be oom-killed. However, p3,...,pn are expected to use a lot of memory, and it's ok for those processes to be oom-killed.

If the user sets oom_score_adj on p1 and p2 to make them very unlikely to be oom-killed, p3,...,pn will inherit the oom_score_adj value, which is bad. Additionally, setting oom_score_adj on p3,...,pn is tricky, since processes in the Docker container (specifically p1 and p2) don't have permissions to set oom_score_adj on p3,...,pn. The ephemeral nature of p3,...,pn also makes setting oom_score_adj on them tricky after they launch.

So, the user hopes that when one of p3,...,pn triggers an oom condition in the Docker container, the oom killer will almost always kill processes from p3,...,pn (and not kill p1 or p2, which are both important and unlikely to trigger an oom condition).

The issue of more processes being killed than are strictly necessary is resulting in p1 or p2 being killed much more frequently when one of p3,...,pn triggers an oom condition, and p1 or p2 being killed is very disruptive for the user (my understanding is that p1 or p2 going down with high frequency results in significant unhealthiness in the user's service).

The change proposed here has not been run in a production system, and so I don't think anyone has data that conclusively demonstrates that this change will solve the user's problem. But, from observations made in their production system, the user is confident that addressing this aggressive oom killing will solve their problem, and we have data that shows this change does considerably reduce the frequency of aggressive oom killing (from 61/100 oom killing events down to 0/100 with this change).

Hope this gives a bit more context.

Thanks,
-Robert
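
To make the oom_score_adj inheritance problem described above a bit more concrete, here is a toy Python model. It is not kernel code and was not part of the testing mentioned above. The names `Proc`, `fork`, `badness`, and `pick_victim` are hypothetical, and the scoring is only a rough approximation of the kernel heuristic (memory footprint plus oom_score_adj scaled by the memory limit). It also assumes the extreme setting of -1000 on p1, which exempts a task from oom killing entirely.

```python
from dataclasses import dataclass, field
from typing import List, Optional

OOM_SCORE_ADJ_MIN = -1000  # value that exempts a task from oom killing


@dataclass
class Proc:
    """Toy stand-in for a task; not a real kernel or /proc structure."""
    name: str
    rss_pages: int
    oom_score_adj: int = 0
    children: List["Proc"] = field(default_factory=list)

    def fork(self, name: str, rss_pages: int) -> "Proc":
        # A forked child starts out with its parent's oom_score_adj.
        child = Proc(name, rss_pages, self.oom_score_adj)
        self.children.append(child)
        return child


def badness(p: Proc, limit_pages: int) -> Optional[int]:
    """Simplified score: footprint plus adj scaled by the memory limit."""
    if p.oom_score_adj == OOM_SCORE_ADJ_MIN:
        return None  # exempt, never selected as a victim
    return p.rss_pages + p.oom_score_adj * (limit_pages // 1000)


def pick_victim(procs: List[Proc], limit_pages: int) -> Optional[Proc]:
    """Return the eligible process with the highest score, if any."""
    scored = [(badness(p, limit_pages), p) for p in procs]
    eligible = [(s, p) for s, p in scored if s is not None]
    if not eligible:
        return None
    return max(eligible, key=lambda sp: sp[0])[1]


if __name__ == "__main__":
    limit = 1_000_000  # assumed container limit, in pages
    p1 = Proc("p1", 50_000, OOM_SCORE_ADJ_MIN)  # protected on purpose
    p2 = p1.fork("p2", 60_000)                  # inherits -1000
    p3 = p2.fork("p3", 400_000)                 # inherits -1000 as well

    # With the inherited value nothing is eligible, even the big p3.
    print(pick_victim([p1, p2, p3], limit))       # prints None

    # Only if something resets p3's value does it become the victim.
    p3.oom_score_adj = 0
    print(pick_victim([p1, p2, p3], limit).name)  # prints p3
```

In this model the protection meant for p1 and p2 leaks down to p3 through `fork`, which is the behavior the user cannot easily undo from inside the container.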