Date: Fri, 22 Jun 2018 10:39:49 +0200
From: Michal Hocko
To: ufo19890607@gmail.com
Cc: akpm@linux-foundation.org, rientjes@google.com,
	kirill.shutemov@linux.intel.com, aarcange@redhat.com,
	penguin-kernel@i-love.sakura.ne.jp, guro@fb.com,
	yang.s@alibaba-inc.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, yuzhoujian@didichuxing.com
Subject: Re: [PATCH v9] Refactor part of the oom report in dump_header
Message-ID: <20180622083949.GR10465@dhcp22.suse.cz>
References: <1529056341-16182-1-git-send-email-ufo19890607@gmail.com>
In-Reply-To: <1529056341-16182-1-git-send-email-ufo19890607@gmail.com>

On Fri 15-06-18 17:52:21, ufo19890607@gmail.com wrote:
> From: yuzhoujian
>
> Some users complain that the system-wide oom report does not print the
> name of the memcg containing the task killed by the oom-killer. The
> current system-wide oom report prints the task's command, gfp_mask,
> order, oom_score_adj and shows the memory info, but misses some
> important information, e.g. the memcg that has reached its limit and
> the memcg to which the killed process is attached.

We do not print the memcg which reached the limit in the global context
because it is completely irrelevant there. I do agree that the memcg of
the oom victim might be interesting, and the changelog should explain
why. So what about the following wording instead:

"
The current system wide oom report prints information about the victim
and the allocation context and restrictions. It, however, doesn't
provide any information about the memory cgroup the victim belongs to.
This information can be interesting for container users because they
can find the victim's container much more easily.
"

> I followed the advice of David Rientjes and Michal Hocko and
> refactored part of the oom report. After this patch, users can get the
> memcg's path from the oom report and check the corresponding container
> more quickly.
>
> The oom print info after this patch:
> oom-kill:constraint=<constraint>,nodemask=<nodemask>,origin_memcg=<origin_memcg>,kill_memcg=<kill_memcg>,task=<comm>,pid=<pid>,uid=<uid>
[...]
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index 6adac113e96d..5bed78d4bfb8 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -15,6 +15,20 @@ struct notifier_block;
>  struct mem_cgroup;
>  struct task_struct;
> 
> +enum oom_constraint {
> +	CONSTRAINT_NONE,
> +	CONSTRAINT_CPUSET,
> +	CONSTRAINT_MEMORY_POLICY,
> +	CONSTRAINT_MEMCG,
> +};
> +
> +static const char * const oom_constraint_text[] = {
> +	[CONSTRAINT_NONE] = "CONSTRAINT_NONE",
> +	[CONSTRAINT_CPUSET] = "CONSTRAINT_CPUSET",
> +	[CONSTRAINT_MEMORY_POLICY] = "CONSTRAINT_MEMORY_POLICY",
> +	[CONSTRAINT_MEMCG] = "CONSTRAINT_MEMCG",
> +};

I've suggested that this should be a separate patch.

[...]
> -void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
> +void mem_cgroup_print_oom_context(struct mem_cgroup *memcg, struct task_struct *p,
> +		enum oom_constraint constraint, nodemask_t *nodemask)
>  {
> -	struct mem_cgroup *iter;
> -	unsigned int i;
> +	struct cgroup *origin_cgrp, *kill_cgrp;
> 
>  	rcu_read_lock();
> 
> +	pr_info("oom-kill:constraint=%s,nodemask=%*pbl,origin_memcg=",
> +		oom_constraint_text[constraint], nodemask_pr_args(nodemask));
> +
> +	if (memcg)
> +		pr_cont_cgroup_path(memcg->css.cgroup);
> +	else
> +		pr_cont("(null)");

I do not like this. What does origin_memcg=(null) tell you? You really
have to know the code to see that this is actually the global oom
killer. Furthermore, I would expect origin_memcg to be the task's
origin memcg rather than the oom's origin. So I think you want the
following instead:

	pr_info("oom-kill:constraint=%s,nodemask=%*pbl",
		oom_constraint_text[constraint], nodemask_pr_args(nodemask));
	if (memcg) {
		pr_cont(", oom_memcg=");
		pr_cont_cgroup_path(memcg->css.cgroup);
	}
	if (p) {
		pr_cont(", task_memcg=");
		pr_cont_cgroup_path(task_cgroup(p, memory_cgrp_id));
		pr_cont(", task=%s, pid=%5d, uid=%5d",
			p->comm, p->pid,
			from_kuid(&init_user_ns, task_uid(p)));
	}
	pr_cont("\n");
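Just for illustration (the memcg path, task name, pid and uid below are
completely made up), a memcg oom would then end up with a single line
along these lines:

	oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null), oom_memcg=/foo/test, task_memcg=/foo/test, task=stress, pid= 1234, uid= 1000

and a global oom would print the same thing without the oom_memcg= part,
so there is no need for any (null) placeholder.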
-- 
Michal Hocko
SUSE Labs