Received: by 2002:a25:2c96:0:0:0:0:0 with SMTP id s144csp1356601ybs; Mon, 25 May 2020 13:59:07 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxpheXccA0M1ZXmdc/eYouqiZQDzrc/0tOQs+mQjJsOHfLL86ofZ9l3kg+RLBrGI1sfh7c7 X-Received: by 2002:a17:906:30c3:: with SMTP id b3mr19404140ejb.126.1590440347708; Mon, 25 May 2020 13:59:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1590440347; cv=none; d=google.com; s=arc-20160816; b=mNVG9lLTozjpPrbMYdArB7FyMEYc95lH9X1p7kq9DZIdfD+OMxR4v2SbcDfSpM8OXY RJs7a2myT1WzlwNs/oArYapiuiHuarSrZ6VncWW9ShGnmafkO8NeRmMcGwvkw9uggpPm NjL5kAkKVq4RCECZcJy/pT0NlxCC3iU/rkmUE0fdCmnU6Jzr+zB7e/bvEKCuiwswox7V 4N1vHRJrCJXx7orHrRhAp6Hi33nUP0jApepNcyP12JJf1mubCU5zRCcrsA5pLacnPczW tUsqmhOtICwqeYuYG6geEShaNuXR9DGUJiweX+J6Iwwfcly7iL8M/qy9KYP4PyZTGZHw h7tA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature; bh=+Wx5/AWc9Y8FAGbaZzHjbam1591EUl99Lphd2VHtxV4=; b=lSyt/r7Bv4s7U9d4jsOFniJqDtsJEPOqFCsziQgPrymNJCUJAiHzlxdZb7hT/kjeWL tVkxRWnxLsnaaid+5rb4ho+zwzPhInXoXLL193tE3lCSmrFh+qzfrVErt/xT5S93Lkea RzAFBLFfQR8D8HO7D72qbq0JlMI2IVdqzE3cjaHWOSIIr2o88xHeJMxONH+C7ISuyD+d Xz5F+ZOWXJURyrbjP9eyandIP18g3ZyZGFnBDirVEfYhNbrcpW5buWSyfUNIJ7lhXUI6 41Qt8hCeVgxVHh580tJYXi2px2XGxqKpwL6gwTpDQdWAVjBeE55QyK8a4gRXEFJGQf3e IawA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@yandex-team.ru header.s=default header.b=Ry64Fww0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=yandex-team.ru Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id y1si9900751edo.418.2020.05.25.13.58.44; Mon, 25 May 2020 13:59:07 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@yandex-team.ru header.s=default header.b=Ry64Fww0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=yandex-team.ru Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404002AbgEYP61 (ORCPT + 99 others); Mon, 25 May 2020 11:58:27 -0400 Received: from forwardcorp1j.mail.yandex.net ([5.45.199.163]:55722 "EHLO forwardcorp1j.mail.yandex.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725969AbgEYP61 (ORCPT ); Mon, 25 May 2020 11:58:27 -0400 Received: from mxbackcorp1o.mail.yandex.net (mxbackcorp1o.mail.yandex.net [IPv6:2a02:6b8:0:1a2d::301]) by forwardcorp1j.mail.yandex.net (Yandex) with ESMTP id F0E232E14CF; Mon, 25 May 2020 18:58:22 +0300 (MSK) Received: from vla5-58875c36c028.qloud-c.yandex.net (vla5-58875c36c028.qloud-c.yandex.net [2a02:6b8:c18:340b:0:640:5887:5c36]) by mxbackcorp1o.mail.yandex.net (mxbackcorp/Yandex) with ESMTP id IdVU2kUvOZ-wLYqGBqd; Mon, 25 May 2020 18:58:22 +0300 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1590422302; bh=+Wx5/AWc9Y8FAGbaZzHjbam1591EUl99Lphd2VHtxV4=; h=In-Reply-To:Message-ID:From:Date:References:To:Subject:Cc; b=Ry64Fww0loGobvdS75y1z+FlRi1IpwX88QXJiP5aUClrQpyTaiClCbrnxR3Z4xW8s /sZDgLi3mRI5WKLYWmHEQZVTu0xnU+fc/oh76oxfspQ3OIsgrEebmSs5ooLzJdAW7g voMlnqIUPzd2almNRcEjSg68n0LLRt+EyRBKZR4Y= Authentication-Results: mxbackcorp1o.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Received: from dynamic-vpn.dhcp.yndx.net (dynamic-vpn.dhcp.yndx.net [2a02:6b8:b081:603::1:c]) by vla5-58875c36c028.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id EzI23gM1tH-wLXSIA4M; Mon, 25 May 2020 18:58:21 +0300 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)) (Client certificate not present) Subject: Re: [PATCH] mm: dump_page: add debugfs file for dumping page state by pfn To: Vlastimil Babka , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton Cc: "Kirill A. Shutemov" , Joonsoo Kim References: <159041635119.987025.7321864888027213705.stgit@buzz> From: Konstantin Khlebnikov Message-ID: Date: Mon, 25 May 2020 18:58:16 +0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-CA Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 25/05/2020 18.35, Vlastimil Babka wrote: > On 5/25/20 4:19 PM, Konstantin Khlebnikov wrote: >> Tool 'page-types' could list pages mapped by process or file cache pages, >> but it shows only limited amount of state exported via procfs. >> >> Let's employ existing helper dump_page() to reach remaining information: >> writing pfn into /sys/kernel/debug/dump_page dumps state into kernel log. > > Yeah that's indeed useful, however I'm less sure if kernel log is the proper way > to extract the data. For example IIRC with the page_owner file can "seek to pfn" > to dump it, although that makes it somewhat harder to use. > > Or we could write pfn to one file and read the dump from another one? But that's > not atomic. > > Perhaps if we could do something like "cat /sys/kernel/debug/dump_page/" > without all the pfns being actually listed in the dump_page directory with "ls"? > Is that possible? Too much code for me. =) This could be kind of ftrace tracer which iterates over pages and dumps them, but anyway looks ridiculously overengineered. This one hack connects existing 'pagemap' with existing 'dump_page', so almost free. For complicated cases there is gdb and special tool drgn https://github.com/osandov/drgn Writing script which parses all that stuff from kernel log isn't big deal either. I have one with 100+ lines regexp for all kinds of kernel splats. Will publish when find time for polishing. > >> # echo 0x37c43c > /sys/kernel/debug/dump_page >> # dmesg | tail -6 >> page:ffffcb0b0df10f00 refcount:1 mapcount:0 mapping:000000007755d3d9 index:0x30 >> 0xffffffffae4239e0 name:"libGeoIP.so.1.6.9" >> flags: 0x200000000020014(uptodate|lru|mappedtodisk) >> raw: 0200000000020014 ffffcb0b187fd288 ffffcb0b189e6248 ffff9528a04afe10 >> raw: 0000000000000030 0000000000000000 00000001ffffffff 0000000000000000 >> page dumped because: debugfs request >> >> With CONFIG_PAGE_OWNER=y shows also stacks for last page alloc and free: >> >> page:ffffea0018fff480 refcount:1 mapcount:1 mapping:0000000000000000 index:0x7f9f28f62 >> anon flags: 0x100000000080034(uptodate|lru|active|swapbacked) >> raw: 0100000000080034 ffffea00184140c8 ffffea0018517d88 ffff8886076ba161 >> raw: 00000007f9f28f62 0000000000000000 0000000100000000 ffff888bfc79f000 >> page dumped because: debugfs request >> page->mem_cgroup:ffff888bfc79f000 >> page_owner tracks the page as allocated >> page last allocated via order 0, migratetype Movable, gfp_mask 0x100dca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO) >> prep_new_page+0x139/0x1a0 >> get_page_from_freelist+0xde9/0x14e0 >> __alloc_pages_nodemask+0x18b/0x360 >> alloc_pages_vma+0x7c/0x270 >> __handle_mm_fault+0xd40/0x12b0 >> handle_mm_fault+0xe7/0x1e0 >> do_page_fault+0x2d5/0x610 >> page_fault+0x2f/0x40 >> page last free stack trace: >> free_pcp_prepare+0x11e/0x1c0 >> free_unref_page_list+0x71/0x180 >> release_pages+0x31e/0x480 >> tlb_flush_mmu+0x44/0x150 >> tlb_finish_mmu+0x3d/0x70 >> exit_mmap+0xdd/0x1a0 >> mmput+0x70/0x140 >> do_exit+0x33f/0xc40 >> do_group_exit+0x3a/0xa0 >> __x64_sys_exit_group+0x14/0x20 >> do_syscall_64+0x48/0x130 >> entry_SYSCALL_64_after_hwframe+0x44/0xa9 >> >> Signed-off-by: Konstantin Khlebnikov >> --- >> Documentation/admin-guide/mm/pagemap.rst | 3 +++ >> Documentation/vm/page_owner.rst | 10 ++++++++++ >> mm/debug.c | 27 +++++++++++++++++++++++++++ >> 3 files changed, 40 insertions(+) >> >> diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst >> index 340a5aee9b80..663ad5490d72 100644 >> --- a/Documentation/admin-guide/mm/pagemap.rst >> +++ b/Documentation/admin-guide/mm/pagemap.rst >> @@ -205,3 +205,6 @@ Before Linux 3.11 pagemap bits 55-60 were used for "page-shift" (which is >> always 12 at most architectures). Since Linux 3.11 their meaning changes >> after first clear of soft-dirty bits. Since Linux 4.2 they are used for >> flags unconditionally. >> + >> +Page state could be dumped into kernel log by writing pfn in text form >> +into /sys/kernel/debug/dump_page. >> diff --git a/Documentation/vm/page_owner.rst b/Documentation/vm/page_owner.rst >> index 0ed5ab8c7ab4..d4d4dc64c19d 100644 >> --- a/Documentation/vm/page_owner.rst >> +++ b/Documentation/vm/page_owner.rst >> @@ -88,3 +88,13 @@ Usage >> >> See the result about who allocated each page >> in the ``sorted_page_owner.txt``. >> + >> +Notes >> +===== >> + >> +To lookup pages in file cache or mapped in process you could use interface >> +pagemap documented in Documentation/admin-guide/mm/pagemap.rst or tool >> +page-types in the tools/vm directory. >> + >> +Page state could be dumped into kernel log by writing pfn in text form >> +into /sys/kernel/debug/dump_page. >> diff --git a/mm/debug.c b/mm/debug.c >> index 2189357f0987..5803f2b63d95 100644 >> --- a/mm/debug.c >> +++ b/mm/debug.c >> @@ -14,6 +14,7 @@ >> #include >> #include >> #include >> +#include >> >> #include "internal.h" >> >> @@ -147,6 +148,32 @@ void dump_page(struct page *page, const char *reason) >> } >> EXPORT_SYMBOL(dump_page); >> >> +#ifdef CONFIG_DEBUG_FS >> +static int dump_page_set(void *data, u64 pfn) >> +{ >> + struct page *page; >> + >> + if (!capable(CAP_SYS_ADMIN)) >> + return -EPERM; >> + >> + page = pfn_to_online_page(pfn); >> + if (!page) >> + return -ENXIO; >> + >> + dump_page(page, "debugfs request"); >> + return 0; >> +} >> +DEFINE_DEBUGFS_ATTRIBUTE(dump_page_fops, NULL, dump_page_set, "%llx\n"); >> + >> +static int __init dump_page_debugfs(void) >> +{ >> + debugfs_create_file_unsafe("dump_page", 0200, NULL, NULL, >> + &dump_page_fops); >> + return 0; >> +} >> +late_initcall(dump_page_debugfs); >> +#endif /* CONFIG_DEBUG_FS */ >> + >> #ifdef CONFIG_DEBUG_VM >> >> void dump_vma(const struct vm_area_struct *vma) >> >