Date: Mon, 16 Sep 2019 13:35:32 +0200
From: Michal Hocko
To: Lucian Adrian Grijincu
Cc: linux-mm@kvack.org, Souptick Joarder, linux-kernel@vger.kernel.org,
    Andrew Morton, Rik van Riel, Roman Gushchin, Hugh Dickins
Subject: Re: [PATCH v3] mm: memory: fix /proc/meminfo reporting for MLOCK_ONFAULT
Message-ID: <20190916113532.GE10231@dhcp22.suse.cz>
References: <20190913211119.416168-1-lucian@fb.com>
In-Reply-To: <20190913211119.416168-1-lucian@fb.com>

[Cc Hugh]

On Fri 13-09-19 14:11:19, Lucian Adrian Grijincu wrote:
> As pages are faulted in MLOCK_ONFAULT correctly updates
> /proc/self/smaps, but doesn't update /proc/meminfo's
> Mlocked field.
>
> - Before this /proc/meminfo fields didn't change as pages were faulted in:
>
> = Start =
> /proc/meminfo
> Unevictable: 10128 kB
> Mlocked: 10132 kB
> = Creating testfile =
>
> = after mlock2(MLOCK_ONFAULT) =
> /proc/meminfo
> Unevictable: 10128 kB
> Mlocked: 10132 kB
> /proc/self/smaps
> 7f8714000000-7f8754000000 rw-s 00000000 08:04 50857050 /root/testfile
> Locked: 0 kB
>
> = after reading half of the file =
> /proc/meminfo
> Unevictable: 10128 kB
> Mlocked: 10132 kB
> /proc/self/smaps
> 7f8714000000-7f8754000000 rw-s 00000000 08:04 50857050 /root/testfile
> Locked: 524288 kB
>
> = after reading the entire the file =
> /proc/meminfo
> Unevictable: 10128 kB
> Mlocked: 10132 kB
> /proc/self/smaps
> 7f8714000000-7f8754000000 rw-s 00000000 08:04 50857050 /root/testfile
> Locked: 1048576 kB
>
> = after munmap =
> /proc/meminfo
> Unevictable: 10128 kB
> Mlocked: 10132 kB
> /proc/self/smaps
>
> - After: /proc/meminfo fields are properly updated as pages are touched:
>
> = Start =
> /proc/meminfo
> Unevictable: 60 kB
> Mlocked: 60 kB
> = Creating testfile =
>
> = after mlock2(MLOCK_ONFAULT) =
> /proc/meminfo
> Unevictable: 60 kB
> Mlocked: 60 kB
> /proc/self/smaps
> 7f2b9c600000-7f2bdc600000 rw-s 00000000 08:04 63045798 /root/testfile
> Locked: 0 kB
>
> = after reading half of the file =
> /proc/meminfo
> Unevictable: 524220 kB
> Mlocked: 524220 kB
> /proc/self/smaps
> 7f2b9c600000-7f2bdc600000 rw-s 00000000 08:04 63045798 /root/testfile
> Locked: 524288 kB
>
> = after reading the entire the file =
> /proc/meminfo
> Unevictable: 1048496 kB
> Mlocked: 1048508 kB
> /proc/self/smaps
> 7f2b9c600000-7f2bdc600000 rw-s 00000000 08:04 63045798 /root/testfile
> Locked: 1048576 kB
>
> = after munmap =
> /proc/meminfo
> Unevictable: 176 kB
> Mlocked: 60 kB
> /proc/self/smaps
>
> Repro code.
> ---
>
> #define _GNU_SOURCE
> #include <errno.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/mman.h>
> #include <sys/syscall.h>
> #include <unistd.h>
>
> int mlock2wrap(const void* addr, size_t len, int flags) {
>   return syscall(SYS_mlock2, addr, len, flags);
> }
>
> void smaps() {
>   char smapscmd[1000];
>   snprintf(
>       smapscmd,
>       sizeof(smapscmd) - 1,
>       "grep testfile -A 20 /proc/%d/smaps | grep -E '(testfile|Locked)'",
>       getpid());
>   printf("/proc/self/smaps\n");
>   fflush(stdout);
>   system(smapscmd);
> }
>
> void meminfo() {
>   const char* meminfocmd = "grep -E '(Mlocked|Unevictable)' /proc/meminfo";
>   printf("/proc/meminfo\n");
>   fflush(stdout);
>   system(meminfocmd);
> }
>
> #define PCHECK(call)                                \
>   {                                                 \
>     int rc = (call);                                \
>     if (rc != 0) {                                  \
>       printf("error %d %s\n", rc, strerror(errno)); \
>       exit(1);                                      \
>     }                                               \
>   }
> int main(int argc, char* argv[]) {
>   printf("= Start =\n");
>   meminfo();
>
>   printf("= Creating testfile =\n");
>   size_t size = 1 << 30; // 1 GiB
>   int fd = open("testfile", O_CREAT | O_RDWR, 0666);
>   {
>     void* buf = malloc(size);
>     write(fd, buf, size);
>     free(buf);
>   }
>   int ret = 0;
>   void* addr = NULL;
>   addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
>
>   if (argc > 1) {
>     PCHECK(mlock2wrap(addr, size, MLOCK_ONFAULT));
>     printf("= after mlock2(MLOCK_ONFAULT) =\n");
>     meminfo();
>     smaps();
>
>     for (size_t i = 0; i < size / 2; i += 4096) {
>       ret += ((char*)addr)[i];
>     }
>     printf("= after reading half of the file =\n");
>     meminfo();
>     smaps();
>
>     for (size_t i = 0; i < size; i += 4096) {
>       ret += ((char*)addr)[i];
>     }
>     printf("= after reading the entire the file =\n");
>     meminfo();
>     smaps();
>
>   } else {
>     PCHECK(mlock(addr, size));
>     printf("= after mlock =\n");
>     meminfo();
>     smaps();
>   }
>
>   PCHECK(munmap(addr, size));
>   printf("= after munmap =\n");
>   meminfo();
>   smaps();
>
>   return ret;
> }
>
> ---
>
> Signed-off-by: Lucian Adrian Grijincu
> Acked-by: Souptick Joarder

Fixes: b0f205c2a308 ("mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage")

I am not really sure a backport to stable is needed, because imprecise
accounting is not really critical.
Pages should eventually get accounted under memory pressure, when an
attempt is made to unmap them, IIRC.

Btw. the changelog could benefit from more details on the issue and a
description of the fix. The reproducer is really nice but it doesn't
explain the maze of the mlock accounting and why only file-backed memory
has a problem.

> ---
>  mm/memory.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index e0c232fe81d9..55da24f33bc4 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3311,6 +3311,8 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
>  	} else {
>  		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
>  		page_add_file_rmap(page, false);
> +		if (vma->vm_flags & VM_LOCKED && !PageTransCompound(page))
> +			mlock_vma_page(page);
>  	}
>  	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);

I dunno. Handling it here in alloc_set_pte sounds a bit weird to me.
Although we already do mlock for CoW pages there, I thought this was more
of an exception. Is there any real reason why this cannot be done in the
standard #PF path? finish_fault for example?
-- 
Michal Hocko
SUSE Labs
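
The reproducer checks the counters by shelling out to grep. If a test
wants to compare Mlocked before and after faulting pages in, reading the
value directly may be more convenient. A minimal sketch, assuming the
usual "Key:   value kB" layout of /proc/meminfo; the helper names
meminfo_field_kb() and meminfo_read_kb() are hypothetical and are not
part of the patch or of any library:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return the value (in kB) of field `key` in
 * /proc/meminfo-style text, or -1 if the field is not present.
 */
long meminfo_field_kb(const char *text, const char *key)
{
	size_t klen;
	const char *line = text;

	assert(text != NULL && key != NULL);
	klen = strlen(key);

	while (line && *line) {
		/* Match "key:" only at the start of a line. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':')
			return strtol(line + klen + 1, NULL, 10);
		/* Advance to the character after the next newline, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Hypothetical helper: look up one field in the live /proc/meminfo. */
long meminfo_read_kb(const char *key)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return meminfo_field_kb(buf, key);
}
```

A test could then record meminfo_read_kb("Mlocked") before and after
touching the mapping and compare the difference with the smaps Locked
value, allowing for unrelated mlocked memory elsewhere on the system.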