From: Aaron Tomlin
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, mhocko@suse.com, penguin-kernel@i-love.sakura.ne.jp, rientjes@google.com, llong@redhat.com, neelx@redhat.com, linux-kernel@vger.kernel.org
Subject: [PATCH v3] mm/oom_kill: show oom eligibility when displaying the current memory state of all tasks
Date: Fri, 30 Jul 2021 17:20:02 +0100
Message-Id: <20210730162002.279678-1-atomlin@redhat.com>

Changes since v2:
 - Use single character (e.g. 'R' for MMF_OOM_SKIP) as suggested by
   Tetsuo Handa
 - Add new header to oom_dump_tasks documentation
 - Provide further justification

The output generated by dump_tasks() can be helpful to determine why
there was an OOM condition and which rogue task potentially caused it.
Please note that this is only provided when sysctl oom_dump_tasks is
enabled.

At the present time, when showing potential OOM victims, we do not
exclude tasks that are not OOM eligible, e.g. those that have
MMF_OOM_SKIP set. It is possible that the last OOM killable victim was
already OOM killed, yet the OOM reaper failed to reclaim memory and then
set MMF_OOM_SKIP. This can be confusing (or perhaps even misleading) to
the viewer. Now, we already unconditionally display a task's
oom_score_adj value, which can be set to OOM_SCORE_ADJ_MIN to indicate
an "unkillable" task.

This patch provides a clear indication of the OOM ineligibility (and the
reason for it) of each displayed task by adding a new column, namely
"oom_skipped". An example is provided below:

[ 5084.524970]  [ pid ]        uid   tgid total_vm   rss pgtables_bytes swapents oom_score_adj oom_skipped name
[ 5084.526397] [660417]          0 660417    35869   683         167936        0         -1000 M           conmon
[ 5084.526400] [660452]          0 660452   175834   472          86016        0          -998             pod
[ 5084.527460] [752415]          0 752415    35869   650         172032        0         -1000 M           conmon
[ 5084.527462] [752575] 1001050000 752575   184205 11158         700416        0           999             npm
[ 5084.527467] [753606] 1001050000 753606   183380 46843        2134016        0           999             node
[ 5084.527581] Memory cgroup out of memory: Killed process 753606 (node) total-vm:733520kB, anon-rss:161228kB, file-rss:26144kB, shmem-rss:0kB, UID:1001050000

The single-character flags are: 'M' for OOM_SCORE_ADJ_MIN, 'R' for
MMF_OOM_SKIP and 'V' for in_vfork().

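To make the precedence between the three flags easier to see, the
selection can be modelled in userspace. This is only a sketch under
stated assumptions: struct task_model and oom_skipped_flag() are
hypothetical names that do not exist in the kernel, and the two boolean
fields merely stand in for test_bit(MMF_OOM_SKIP, ...) and in_vfork().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value as OOM_SCORE_ADJ_MIN in include/uapi/linux/oom.h. */
#define OOM_SCORE_ADJ_MIN (-1000)

/*
 * Hypothetical userspace stand-in for the task state the patch inspects.
 * None of these field names come from the kernel.
 */
struct task_model {
	short oom_score_adj;	/* models p->signal->oom_score_adj */
	bool mmf_oom_skip;	/* models test_bit(MMF_OOM_SKIP, &p->mm->flags) */
	bool in_vfork;		/* models in_vfork(p) */
};

/*
 * Return the single-character "oom_skipped" flag for a task, or an empty
 * string when no ineligibility reason applies. The first matching reason
 * wins, so at most one character is ever reported.
 */
static const char *oom_skipped_flag(const struct task_model *t)
{
	assert(t != NULL);

	if (t->oom_score_adj == OOM_SCORE_ADJ_MIN)
		return "M";
	if (t->mmf_oom_skip)
		return "R";
	if (t->in_vfork)
		return "V";
	return "";
}
```

With this ordering, a task that has oom_score_adj set to
OOM_SCORE_ADJ_MIN and also has MMF_OOM_SKIP set would be reported as 'M'
only; the other reasons are not shown.
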
Signed-off-by: Aaron Tomlin
---
 Documentation/admin-guide/sysctl/vm.rst |  5 ++--
 mm/oom_kill.c                           | 31 +++++++++++++++++++++----
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 003d5cc3751b..4c79fa00ddb3 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -650,8 +650,9 @@ oom_dump_tasks
 Enables a system-wide task dump (excluding kernel threads) to be produced
 when the kernel performs an OOM-killing and includes such information as
 pid, uid, tgid, vm size, rss, pgtables_bytes, swapents, oom_score_adj
-score, and name. This is helpful to determine why the OOM killer was
-invoked, to identify the rogue task that caused it, and to determine why
+score, oom eligibility status and name. This is helpful to determine why
+the OOM killer was invoked, to identify the rogue task that caused it, and
+to determine why
 the OOM killer chose the task it did to kill.
 
 If this is set to zero, this information is suppressed. On very
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index c729a4c4a1ac..36daa6917b62 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -160,6 +160,27 @@ static inline bool is_sysrq_oom(struct oom_control *oc)
 	return oc->order == -1;
 }
 
+/**
+ * is_task_oom_eligible - determine if and why a task cannot be OOM killed
+ * @p: task to check
+ *
+ * Needs to be called with task_lock().
+ */
+static const char * const is_task_oom_eligible(struct task_struct *p)
+{
+	long adj;
+
+	adj = (long)p->signal->oom_score_adj;
+	if (adj == OOM_SCORE_ADJ_MIN)
+		return "M";
+	else if (test_bit(MMF_OOM_SKIP, &p->mm->flags))
+		return "R";
+	else if (in_vfork(p))
+		return "V";
+	else
+		return "";
+}
+
 /* return true if the task is not adequate as candidate victim task. */
 static bool oom_unkillable_task(struct task_struct *p)
 {
@@ -401,12 +422,13 @@ static int dump_task(struct task_struct *p, void *arg)
 		return 0;
 	}
 
-	pr_info("[%7d] %5d %5d %8lu %8lu %8ld %8lu %5hd %s\n",
+	pr_info("[%7d] %5d %5d %8lu %8lu %8ld %8lu %5hd %1s %s\n",
 		task->pid, from_kuid(&init_user_ns, task_uid(task)),
 		task->tgid, task->mm->total_vm, get_mm_rss(task->mm),
 		mm_pgtables_bytes(task->mm),
 		get_mm_counter(task->mm, MM_SWAPENTS),
-		task->signal->oom_score_adj, task->comm);
+		task->signal->oom_score_adj, is_task_oom_eligible(task),
+		task->comm);
 	task_unlock(task);
 
 	return 0;
@@ -420,12 +442,13 @@ static int dump_task(struct task_struct *p, void *arg)
  * memcg, not in the same cpuset, or bound to a disjoint set of mempolicy nodes
  * are not shown.
  * State information includes task's pid, uid, tgid, vm size, rss,
- * pgtables_bytes, swapents, oom_score_adj value, and name.
+ * pgtables_bytes, swapents, oom_score_adj value, oom eligibility status
+ * and name.
  */
 static void dump_tasks(struct oom_control *oc)
 {
 	pr_info("Tasks state (memory values in pages):\n");
-	pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
+	pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj oom_skipped name\n");
 
 	if (is_memcg_oom(oc))
 		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
-- 
2.31.1