Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755859Ab3G3NJQ (ORCPT ); Tue, 30 Jul 2013 09:09:16 -0400
Received: from mail-ob0-f182.google.com ([209.85.214.182]:60448
	"EHLO mail-ob0-f182.google.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1755316Ab3G3NJJ (ORCPT );
	Tue, 30 Jul 2013 09:09:09 -0400
MIME-Version: 1.0
In-Reply-To: <20130730090202.GL3008@twins.programming.kicks-ass.net>
References: <20130625104700.GZ28407@twins.programming.kicks-ass.net>
	<20130625105123.GA13649@gmail.com>
	<20130626103303.GB28407@twins.programming.kicks-ass.net>
	<20130628095828.GG29209@dyad.programming.kicks-ass.net>
	<20130708081919.GV23916@twins.programming.kicks-ass.net>
	<20130730083758.GH3008@twins.programming.kicks-ass.net>
	<20130730090202.GL3008@twins.programming.kicks-ass.net>
Date: Tue, 30 Jul 2013 15:09:08 +0200
Message-ID:
Subject: Re: [PATCH 0/8] perf: add ability to sample physical data addresses
From: Stephane Eranian
To: Peter Zijlstra
Cc: Ingo Molnar, LKML, "mingo@elte.hu", "ak@linux.intel.com",
	Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3436
Lines: 84

On Tue, Jul 30, 2013 at 11:02 AM, Peter Zijlstra wrote:
> On Tue, Jul 30, 2013 at 10:51:46AM +0200, Stephane Eranian wrote:
>> On Tue, Jul 30, 2013 at 10:37 AM, Peter Zijlstra wrote:
>> > On Tue, Jul 30, 2013 at 10:02:01AM +0200, Stephane Eranian wrote:
>> >> > Ahh. We don't put the useful bits in the mmap event; we'll need to fix
>> >> > that too then ;-)
>> >> >
>> >> > Doing so is going to be a bit of a bother since we use the tail of
>> >> > PERF_RECORD_MMAP for filenames and thus aren't particularly extensible.
>> >> >
>> >> > This would mean doing something like PERF_RECORD_MMAP2 and some means
>> >> > for userspace to request the new events instead of the old one.
>> >> >
>> >> Tracking mmaps even for shmat() won't cover the paging cases. When you
>> >> page a page back in, it most likely gets a different physical page. How
>> >> would we track that case too using the same approach?
>> >
>> > It doesn't matter. Even if a page ends up being a different physical
>> > page, it will always be the same sb:inode:pgoffset. You should be able
>> > to always uniquely identify a (shared) page by that triplet.
>> >
>> Ok, so you're saying that triplet uniquely identifies a virtual page
>> regardless of the physical page it is mapped onto. If the physical page
>> changes because of paging, we keep the same triplet and therefore we can
>> still detect the false sharing.
>
> Exactly.
>
I see this for my program:

7f0a59cbe000-7f0a59cc1000 rw-p 00000000 00:00 0
7f0a59cd3000-7f0a59cd4000 rw-p 00000000 00:00 0
7f0a59cd4000-7f0a59cd5000 rw-s 00000000 00:04 458753 /SYSV00000000 (deleted)
7f0a59cd5000-7f0a59cd6000 rw-s 00000000 00:04 425984 /SYSV00000000 (deleted)
7f0a59cd6000-7f0a59cd7000 rw-s 00000000 00:04 425984 /SYSV00000000 (deleted)

The first 2 lines are heap. Nothing useful comes out of maj:min and ino
there. For a shared segment, however, we can use the ino number. Shared
memory segments appear as files in the vma, so the kernel does fill in the
ino, maj, and min numbers. In my program I map the same segment twice, and
we can see that the last two mappings are identical.

In the case of regular paging, there is no useful info there. But then I
suspect that for a private heap page we only care about the multi-threaded
case, and there the physical page is irrelevant. So it seems all we need is
to cover the shared segment case: we can get the info from the vma and
create an MMAP2 record for it. Do we agree?

>> > So if we create a new MMAP record that includes the device (substitute
>> > for the superblock) and inode information we should be good.
>>
>> I will try that. I am not familiar with mm, so where do we find the
>> device?
>> Inside the vma?
>
> Take a peek at fs/proc/task_mmu.c:show_map_vma(), it's the code used to
> print /proc/$PID/maps and displays all stuff we want.

This is what I see in that function:

	if (file) {
		struct inode *inode = file_inode(vma->vm_file);
		dev = inode->i_sb->s_dev;
		ino = inode->i_ino;
		pgoff = ((loff_t)vma->vm_pgoff) << PAGE_SHIFT;
	}

It works for anything associated with a file.
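
To make the identity check concrete, here is a minimal userspace sketch,
assuming the /proc/$PID/maps line format shown earlier in this mail. The
names struct map_key, parse_maps_line(), key_is_useful() and same_backing()
are hypothetical, made up for illustration; they are not part of perf or the
kernel. The sketch pulls maj:min, ino and the offset out of a maps line and
compares them, which is roughly what a tool could do with the fields of an
MMAP2-style record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical key: device (major:minor), inode, and mapping offset. */
struct map_key {
	unsigned int maj, min;
	unsigned long ino;
	unsigned long long pgoff;	/* byte offset, as printed in maps */
};

/*
 * Parse one /proc/<pid>/maps line into a key.
 * Returns 0 on success, -1 if the line does not match the expected format.
 */
static int parse_maps_line(const char *line, struct map_key *key)
{
	unsigned long long start, end;
	char perms[5];

	assert(line != NULL && key != NULL);
	/* Format: start-end perms offset maj:min ino [path] */
	if (sscanf(line, "%llx-%llx %4s %llx %x:%x %lu",
		   &start, &end, perms, &key->pgoff,
		   &key->maj, &key->min, &key->ino) != 7)
		return -1;
	return 0;
}

/* Anonymous mappings show up as 00:00 0 and carry no usable identity. */
static bool key_is_useful(const struct map_key *k)
{
	return k->ino != 0;
}

/* Two mappings share a backing object if dev, ino and offset all match. */
static bool same_backing(const struct map_key *a, const struct map_key *b)
{
	return key_is_useful(a) && key_is_useful(b) &&
	       a->maj == b->maj && a->min == b->min &&
	       a->ino == b->ino && a->pgoff == b->pgoff;
}
```

Fed the maps lines from my program, the two 425984 mappings compare equal,
the 458753 one does not, and the 00:00 0 lines are rejected. Note that this
compares only the offset at the start of each mapping; to identify a single
page a tool would add (addr - start) to that offset before comparing.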