Reply-To: prakash.sangappa@oracle.com
Subject: Re: [RFC PATCH] Add /proc//numa_vamaps for numa node information
References: <1525240686-13335-1-git-send-email-prakash.sangappa@oracle.com> <20180502143323.1c723ccb509c3497050a2e0a@linux-foundation.org> <2ce01d91-5fba-b1b7-2956-c8cc1853536d@intel.com>
To: Dave Hansen, Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-api@vger.kernel.org, mhocko@suse.com, kirill.shutemov@linux.intel.com, n-horiguchi@ah.jp.nec.com, drepper@gmail.com, rientjes@google.com, Naoya Horiguchi
From: "prakash.sangappa"
Message-ID: <5d2d820b-4a6e-242d-3927-0d693198602a@oracle.com>
Date: Wed, 2 May 2018 16:17:41 -0700
In-Reply-To: <2ce01d91-5fba-b1b7-2956-c8cc1853536d@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On
05/02/2018 03:28 PM, Dave Hansen wrote:
> On 05/02/2018 02:33 PM, Andrew Morton wrote:
>> On Tue, 1 May 2018 22:58:06 -0700 Prakash Sangappa wrote:
>>> For analysis purpose it is useful to have numa node information
>>> corresponding mapped address ranges of the process. Currently
>>> /proc//numa_maps provides list of numa nodes from where pages are
>>> allocated per VMA of the process. This is not useful if an user needs to
>>> determine which numa node the mapped pages are allocated from for a
>>> particular address range. It would have helped if the numa node information
>>> presented in /proc//numa_maps was broken down by VA ranges showing the
>>> exact numa node from where the pages have been allocated.
> I'm finding myself a little lost in figuring out what this does. Today,
> numa_maps might us that a 3-page VMA has 1 page from Node 0 and 2 pages
> from Node 1. We group *entirely* by VMA:
>
> 1000-4000 N0=1 N1=2

Yes

> We don't want that. We want to tell exactly where each node's memory is
> despite if they are in the same VMA, like this:
>
> 1000-2000 N1=1
> 2000-3000 N0=1
> 3000-4000 N1=1
>
> So that no line of output ever has more than one node's memory. It

Yes, that is exactly what this patch will provide. It may not have been
clear from the sample output I had included. Here is another snippet
from a process.

..
006dc000-006dd000 N1=1 kernelpagesize_kB=4 anon=1 dirty=1 file=/usr/bin/bash
006dd000-006de000 N0=1 kernelpagesize_kB=4 anon=1 dirty=1 file=/usr/bin/bash
006de000-006e0000 N1=2 kernelpagesize_kB=4 anon=2 dirty=2 file=/usr/bin/bash
006e0000-006e6000 N0=6 kernelpagesize_kB=4 anon=6 dirty=6 file=/usr/bin/bash
006e6000-006eb000 N0=5 kernelpagesize_kB=4 anon=5 dirty=5
006eb000-006ec000 N1=1 kernelpagesize_kB=4 anon=1 dirty=1
007f9000-007fa000 N1=1 kernelpagesize_kB=4 anon=1 dirty=1 heap
007fa000-00965000 N0=363 kernelpagesize_kB=4 anon=363 dirty=363 heap
00965000-0096c000 - heap
0096c000-0096d000 N0=1 kernelpagesize_kB=4 anon=1 dirty=1 heap
0096d000-00984000 - heap
..

> *appears* in this new file as if each contiguous range of memory from a
> given node has its own VMA. Right?

No. It just breaks each VMA of the process down into address ranges,
one per line, whose pages are all on the same numa node, i.e. each line
will indicate memory from one numa node only.

>
> This sounds interesting, but I've never found myself wanting this
> information a single time that I can recall. I'd love to hear more.
>
> Is this for debugging? Are apps actually going to *parse* this file?

Yes, mainly for debugging/performance analysis. A user doing the
analysis can look at this file. The Oracle Database team will be using
this information.

>
> How hard did you try to share code with numa_maps? Are you sure we
> can't just replace numa_maps? VMAs are a kernel-internal thing and we
> never promised to represent them 1:1 in our ABI.

I was inclined to just modify numa_maps. However, the man page documents
the numa_maps format as correlating with the 'maps' file. I was wondering
whether apps/scripts would break if we changed the output of 'numa_maps',
so I decided to add a new file instead.

I could try to share the code with numa_maps.

>
> Are we going to continue creating new files in /proc every time a tiny
> new niche pops up? :)

I wish we could just enhance the existing files.
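
On the question of tools parsing this file: below is a minimal Python sketch of how a script might read the per-range lines and total the pages on each node. It assumes the line format of the sample output in this mail, which comes from an RFC patch and may change. The names parse_line and pages_per_node are hypothetical helpers for illustration only, and the sample text is a subset of the lines quoted above.

```python
import re
from collections import Counter

# A few lines copied from the sample output in this mail. The format is
# that of the RFC patch and is an assumption, not a stable ABI.
SAMPLE = """\
006dc000-006dd000 N1=1 kernelpagesize_kB=4 anon=1 dirty=1 file=/usr/bin/bash
006dd000-006de000 N0=1 kernelpagesize_kB=4 anon=1 dirty=1 file=/usr/bin/bash
007fa000-00965000 N0=363 kernelpagesize_kB=4 anon=363 dirty=363 heap
00965000-0096c000 - heap
"""

# Matches a per-node page count field such as "N0=363".
NODE_RE = re.compile(r"^N(\d+)=(\d+)$")


def parse_line(line):
    """Return (start, end, node, pages) for one line.

    node is None and pages is 0 for a range with no node field
    (shown as '-' in the sample output).
    """
    fields = line.split()
    start_s, end_s = fields[0].split("-")
    start, end = int(start_s, 16), int(end_s, 16)
    for field in fields[1:]:
        m = NODE_RE.match(field)
        if m:
            return start, end, int(m.group(1)), int(m.group(2))
    return start, end, None, 0


def pages_per_node(text):
    """Sum the page counts per numa node over all lines in text."""
    totals = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "..":  # skip blanks and elision markers
            continue
        _, _, node, pages = parse_line(line)
        if node is not None:
            totals[node] += pages
    return totals


if __name__ == "__main__":
    print(dict(pages_per_node(SAMPLE)))
```

Because each line carries at most one node field, a consumer needs no per-line aggregation, which is the main difference from parsing numa_maps.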