References: <20210226021254.3980-1-shy828301@gmail.com>
From: Yang Shi
Date: Mon, 1 Mar 2021 09:17:50 -0800
Subject: Re: [PATCH] doc: memcontrol: add description for oom_kill
To: Michal Hocko
Cc: Johannes Weiner, Roman Gushchin, Shakeel Butt, Andrew Morton, Jonathan Corbet, Linux MM, Linux Kernel Mailing List
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Mar 1, 2021 at 4:24 AM Michal Hocko wrote:
>
> On Fri 26-02-21 11:19:51, Yang Shi wrote:
> > On Fri, Feb 26, 2021 at 8:42 AM Yang Shi wrote:
> > >
> > > On Thu, Feb 25, 2021 at 11:30 PM Michal Hocko wrote:
> > > >
> > > > On Thu 25-02-21 18:12:54, Yang Shi wrote:
> > > > > When debugging an oom issue, I found the oom_kill counter of memcg is
> > > > > confusing. At first glance, without checking the documentation, I thought
> > > > > it just counts memcg ooms, but it turns out it counts both global and
> > > > > memcg ooms.
> > > >
> > > > Yes, this is the case indeed. The point of the counter was to count oom
> > > > victims from the memcg rather than matching that to the source of the
> > > > oom. Remember that this could have been a memcg oom up in the
> > > > hierarchy as well.
> > > > Counting victims on the oom origin could be equally
> > >
> > > Yes, it is updated hierarchically on v2, but not on v1. I suppose
> > > this is because v1 may work in non-hierarchical mode? If this is the
> > > only reason, we may be able to remove this to get aligned with v2, since
> > > non-hierarchical mode is no longer supported.
> >
> > BTW, having the counter recorded hierarchically may help out one of
> > our use cases. We want to monitor the oom_kill for some services, but
> > systemd would wipe out the cgroup if the service is oom killed, then
> > restart the service from scratch (that is, create a brand new cgroup
> > with the same name). So this systemd behavior makes the counter
> > useless if it is not recorded hierarchically.
>
> Just to make sure I understand correctly. You have a setup where memcg
> for a service has a hard limit configured and it is destroyed when oom
> happens inside that memcg. A new instance is created at the same place
> of the hierarchy with a new memcg. Your problem is that the oom killed
> memcg will not be recorded in its parent oom event and the information
> will get lost with the torn down memcg. Correct?

Yes. But global oom instead of memcg oom.

>
> If yes then how do you tell which of the child cgroup was killed from
> the parent counter? Or is there only a single child?

There is more than one child, but in our case the oom-killed child
consumes 90% of memory, so the global oom killer picks it. This
doesn't rule out oom kills in other children being counted too, but we
don't need a very accurate counter, and in our case we can tell that
99% of the oom kills happen in that specific memcg.

>
> Anyway, cgroup v2 will offer the hierarchical behavior. Do you have any
> strong reasons that you cannot use v2?

Personally, I would prefer to migrate to cgroup v2. But it requires
significant work on orchestration tools, infrastructure
configuration, monitoring tools, etc., which are out of my control.

> --
> Michal Hocko
> SUSE Labs
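
For illustration only, here is a minimal sketch of how a monitoring tool might read the counter discussed above from the parent cgroup, so the value survives when systemd tears down and recreates the child. It assumes a cgroup v2 mount where memory.events holds "key value" lines including oom_kill, and where memory.events.local (on kernels that provide it) holds the non-hierarchical counts. The helper names parse_flat_keyed and read_oom_kill are hypothetical and not part of any kernel or systemd interface.

```python
from pathlib import Path


def parse_flat_keyed(text):
    """Parse a flat-keyed cgroup file ("key value" per line) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or malformed lines instead of failing on them.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def read_oom_kill(cgroup_dir, filename="memory.events"):
    """Return the oom_kill counter found in cgroup_dir/filename.

    Returns None if the file cannot be read (for example, the cgroup has
    been removed) or if it has no oom_kill key. The default filename
    assumes cgroup v2; pass "memory.events.local" for the
    non-hierarchical view.
    """
    try:
        text = (Path(cgroup_dir) / filename).read_text()
    except OSError:
        return None
    return parse_flat_keyed(text).get("oom_kill")
```

A caller could poll read_oom_kill() on the parent slice (the path depends on the local setup) and compare successive readings. As noted in the thread, the parent counter cannot say which child was killed, so this only fits setups where one child dominates memory use.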