Date: Thu, 15 Feb 2018 09:40:24 +0100
From: Bruno Prémont
To: cgroups@vger.kernel.org
Cc: Tejun Heo, Johannes Weiner, Michal Hocko, Vladimir Davydov, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: CGroup-v2: Memory: current impossibly high for empty cgroup
Message-ID: <20180215094024.5fc6161a@pluto.restena.lu>

With a 4.15.2 kernel I'm hitting a state where a given leaf v2 cgroup considers itself permanently over-limit and OOM-kills any process I try to move into it (the cgroup is currently empty!). I can't hand out a simple reproducer right now, but it looks like the counter went "negative" during accounting.
CGroupv2 is mounted on /sys/fs/cgroup/hosting/ and the leaf cgroup is /sys/fs/cgroup/hosting/websrv/:

cgroup.controllers:cpu io memory pids
cgroup.events:populated 0
cgroup.max.depth:max
cgroup.max.descendants:max
cgroup.stat:nr_descendants 0
cgroup.stat:nr_dying_descendants 0
cgroup.type:domain
cpu.max:max 100000
cpu.stat:usage_usec 48830679
cpu.stat:user_usec 25448590
cpu.stat:system_usec 23382088
cpu.stat:nr_periods 0
cpu.stat:nr_throttled 0
cpu.stat:throttled_usec 0
cpu.weight:100
cpu.weight.nice:0
io.bfq.weight:100
io.stat:8:0 rbytes=143360 wbytes=32768 rios=24 wios=7
io.stat:7:0 rbytes=51687424 wbytes=0 rios=12619 wios=0
io.weight:default 100
memory.current:18446744073694965760
memory.events:low 0
memory.events:high 0
memory.events:max 114
memory.events:oom 15
memory.events:oom_kill 8
memory.high:8589934592
memory.low:4294967296
memory.max:17179869184
memory.stat:anon 0
memory.stat:file 0
memory.stat:kernel_stack 0
memory.stat:slab 593920
memory.stat:sock 0
memory.stat:shmem 0
memory.stat:file_mapped 0
memory.stat:file_dirty 0
memory.stat:file_writeback 0
memory.stat:inactive_anon 0
memory.stat:active_anon 0
memory.stat:inactive_file 0
memory.stat:active_file 0
memory.stat:unevictable 0
memory.stat:slab_reclaimable 204800
memory.stat:slab_unreclaimable 389120
memory.stat:pgfault 80983
memory.stat:pgmajfault 259
memory.stat:pgrefill 115
memory.stat:pgscan 384
memory.stat:pgsteal 263
memory.stat:pgactivate 100
memory.stat:pgdeactivate 115
memory.stat:pglazyfree 0
memory.stat:pglazyfreed 0
memory.stat:workingset_refault 0
memory.stat:workingset_activate 0
memory.stat:workingset_nodereclaim 3
pids.current:0
pids.events:max 0
pids.max:max

cgroup.procs is empty.
Possibly interesting content in the kernel log:

[580391.746367] WARNING: CPU: 1 PID: 24354 at /data/kernel/linux-4.15/mm/page_counter.c:27 page_counter_cancel+0x10/0x20
[580391.746498] Modules linked in: floppy
[580391.746551] CPU: 1 PID: 24354 Comm: image-starter Not tainted 4.15.2-x86_64-vmware #2
[580391.746612] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.0.B64.1508272355 08/27/2015
[580391.746716] RIP: 0010:page_counter_cancel+0x10/0x20
[580391.746767] RSP: 0018:ffffad4f8475fb08 EFLAGS: 00010293
[580391.746819] RAX: 0000000000000005 RBX: ffffad4f8475fb38 RCX: 000000000000000e
[580391.746877] RDX: ffffa058b6a51148 RSI: 000000000000000e RDI: ffffa058b6a51148
[580391.746935] RBP: 000000000000000e R08: 0000000000022b30 R09: 0000000000000c4c
[580391.746993] R10: 0000000000000004 R11: ffffffffffffffff R12: ffffad4f8475fb38
[580391.747051] R13: ffffffffbbc68df8 R14: ffffad4f8475fd30 R15: 0000000000000001
[580391.747117] FS:  00007fd5e3b84740(0000) GS:ffffa058bdd00000(0000) knlGS:0000000000000000
[580391.747178] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[580391.747233] CR2: 00000000016fa3e8 CR3: 0000000005c0a001 CR4: 00000000000606a0
[580391.747344] Call Trace:
[580391.747400]  page_counter_uncharge+0x16/0x20
[580391.747450]  uncharge_batch+0x2e/0x150
[580391.747503]  mem_cgroup_uncharge_list+0x54/0x60
[580391.747555]  release_pages+0x2f6/0x330
[580391.747609]  __pagevec_release+0x25/0x30
[580391.747659]  truncate_inode_pages_range+0x251/0x6e0
[580391.747712]  ? write_cache_pages+0x311/0x350
[580391.747763]  __blkdev_put+0x6f/0x1e0
[580391.747815]  deactivate_locked_super+0x2a/0x60
[580391.747870]  cleanup_mnt+0x40/0x60
[580391.747921]  task_work_run+0x7b/0xa0
[580391.747975]  do_exit+0x38d/0x960
[580391.749543]  do_group_exit+0x99/0xa0
[580391.750525]  SyS_exit_group+0xb/0x10
[580391.751457]  do_syscall_64+0x74/0x120
[580391.752427]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[580391.753370] RIP: 0033:0x7fd5e3463529
[580391.754290] RSP: 002b:00007ffdaad73478 EFLAGS: 00000206 ORIG_RAX: 00000000000000e7
[580391.755205] RAX: ffffffffffffffda RBX: 000000002d220011 RCX: 00007fd5e3463529
[580391.756119] RDX: 0000000000000000 RSI: 00007ffdaad732dc RDI: 0000000000000000
[580391.758421] RBP: 0000000000000000 R08: 000000000000003c R09: 00000000000000e7
[580391.759590] R10: ffffffffffffff80 R11: 0000000000000206 R12: 00007ffdaad72480
[580391.760499] R13: 0000000000001000 R14: 00000000016f8080 R15: 0000000000000000
[580391.761366] Code: eb 07 e8 f4 26 fb ff eb d0 48 c7 c7 00 40 c3 bb e8 46 50 44 00 89 d8 5b 5d c3 90 48 89 f0 48 f7 d8 f0 48 0f c1 07 48 39 f0 79 02 <0f> ff c3 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 eb 19 48 89 f0
[580391.763136] ---[ end trace 60c773f283acdb6f ]---
[580393.195186] image-starter invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
[580393.199401] image-starter cpuset=/ mems_allowed=0
[580393.201491] CPU: 1 PID: 28447 Comm: image-starter Tainted: G        W        4.15.2-x86_64-vmware #2
[580393.202899] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.0.B64.1508272355 08/27/2015
[580393.202901] Call Trace:
[580393.202918]  dump_stack+0x5c/0x85
[580393.202927]  dump_header+0x5a/0x233
[580393.202933]  oom_kill_process+0x91/0x3e0
[580393.202936]  out_of_memory+0x221/0x240
[580393.202940]  mem_cgroup_out_of_memory+0x36/0x50
[580393.202942]  mem_cgroup_oom_synchronize+0x1ff/0x300
[580393.202945]  ? __mem_cgroup_insert_exceeded+0x90/0x90
[580393.202947]  pagefault_out_of_memory+0x1f/0x4d
[580393.202951]  __do_page_fault+0x323/0x390
[580393.202957]  ? page_fault+0x36/0x60
[580393.202959]  page_fault+0x4c/0x60
[580393.202964] RIP: 0033:0x7f9edd31c919
[580393.202966] RSP: 002b:00007ffd818f4290 EFLAGS: 00010246
[580393.202968] Task in /websrv killed as a result of limit of /websrv
[580393.202972] memory: usage 18446744073709537436kB, limit 16777216kB, failcnt 44
[580393.202973] memory+swap: usage 0kB, limit 9007199254740988kB, failcnt 0
[580393.202975] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
[580393.202976] Memory cgroup stats for /websrv: cache:16KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:16KB inactive_anon:0KB active_anon:0KB inactive_file:8KB active_file:8KB unevictable:0KB
[580393.202983] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[580393.203001] [28447]     0 28447     2143       31    57344        0             0 image-starter
[580393.203004] [28448]     0 28448     2143       34    57344        0             0 image-starter
[580393.203006] [28449]     0 28449     2143       34    57344        0             0 image-starter
[580393.203008] [28450]     0 28450     2143       34    57344        0             0 image-starter
[580393.203010] [28451]     0 28451     2143       34    57344        0             0 image-starter
[580393.203012] Memory cgroup out of memory: Kill process 28449 (image-starter) score 0 or sacrifice child
[580393.203026] Killed process 28449 (image-starter) total-vm:8572kB, anon-rss:136kB, file-rss:0kB, shmem-rss:0kB
... (other processes being OOM-killed)

image-starter is a small container-management tool that makes use of namespaces and assigns the namespaced processes to corresponding cgroups (the pertinent cgroup directories are bind-mounted into the mount namespace).

The probable memory-counter underflow seems to have happened either on stop of the container (likely) or on its next start. Possibly of interest: some mmaps inside the cgroup failed during the previous run of the container due to RLIMIT_DATA.
As this is a test system, I can happily perform debugging and experimentation on it (I'm currently working on moving from v1 cgroups to v2 cgroups).

Cheers,
Bruno