Date: Wed, 15 Mar 2017 10:50:59 -0300
From: Arnaldo Carvalho de Melo
To: Stephane Eranian
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, Ingo Molnar,
    Jiri Olsa, namhyung.kim@kernel.org, acme@kernel.org
Subject: Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale
Message-ID: <20170315135059.GC2177@redhat.com>
In-Reply-To: <1489561041-19778-1-git-send-email-eranian@google.com>
References: <1489561041-19778-1-git-send-email-eranian@google.com>

On Tue, Mar 14, 2017 at 11:57:21PM -0700, Stephane Eranian wrote:
> This patch significantly improves the execution time of
> perf_event__synthesize_mmap_events() when running perf record on
> systems where processes have lots of threads. It just happens that the
> /proc/pid/maps code uses an O(N^2) algorithm to generate each map line
> in the maps file. If you have 1000 threads, then you necessarily have
> 1000 stacks. For each vma, you need to check whether it corresponds to
> a thread's stack. With a large number of threads, this can take a very
> long time. I have seen latencies >> 10 minutes.
>
> As of today, perf does not use the fact that a mapping is a stack,
> therefore we can work around the issue by using /proc/pid/tasks/pid/maps.
> This entry does not try to map a vma to a stack and is thus much
> faster with no loss of functionality.
>
> The proc-map-timeout logic is kept in case the user still wants some
> upper limit.
>
> Signed-off-by: Stephane Eranian
> ---
>  tools/perf/util/event.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 4ea7ce7..b137566 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
>  	if (machine__is_default_guest(machine))
>  		return 0;
>
> -	snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
> -		 machine->root_dir, pid);
> +	snprintf(filename, sizeof(filename), "%s/proc/%d/tasks/%d/maps",
> +		 machine->root_dir, pid, pid);

Humm...
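(Aside, before digging in: the latency claim is easy to check on an
affected kernel with a trivial, self-contained timer like the one below.
None of it comes from the perf sources and all names in it are made up;
it just reads whatever maps file it is given on the command line and
reports how long the full sequential read took, so it can be pointed at
/proc/<pid>/maps and at the per-task variant for comparison.)

/*
 * Illustrative only, not from tools/perf: time a full read of a
 * /proc/<pid>/maps style file given on the command line.
 *
 * Build: gcc -O2 -o timemaps timemaps.c
 * Usage: ./timemaps /proc/<pid>/maps
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char *argv[])
{
	char line[8192];
	unsigned long nlines = 0;
	struct timespec t0, t1;
	FILE *fp;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <maps file>\n", argv[0]);
		return EXIT_FAILURE;
	}

	fp = fopen(argv[1], "r");
	if (fp == NULL) {
		perror(argv[1]);
		return EXIT_FAILURE;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	while (fgets(line, sizeof(line), fp) != NULL)
		nlines++;		/* one line per vma */
	clock_gettime(CLOCK_MONOTONIC, &t1);
	fclose(fp);

	printf("%lu lines in %.3f seconds\n", nlines,
	       (t1.tv_sec - t0.tv_sec) +
	       (t1.tv_nsec - t0.tv_nsec) / 1e9);
	return EXIT_SUCCESS;
}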
[root@jouet ~]# ps -C firefox -L
  PID   LWP TTY          TIME CMD
25023 25023 tty2     01:32:10 firefox
25023 25048 tty2     00:00:00 gmain
25023 25049 tty2     00:00:00 gdbus
25023 25056 tty2     00:00:00 Gecko_IOThread
25023 25058 tty2     00:00:00 Link Monitor
25023 25059 tty2     00:03:52 Socket Thread
25023 25060 tty2     00:00:00 JS Watchdog
25023 25061 tty2     00:00:12 JS Helper
25023 25062 tty2     00:00:12 JS Helper
25023 25063 tty2     00:00:13 JS Helper
25023 25064 tty2     00:00:12 JS Helper
25023 25065 tty2     00:00:13 JS Helper
25023 25066 tty2     00:00:12 JS Helper
25023 25067 tty2     00:00:13 JS Helper
25023 25068 tty2     00:00:13 JS Helper
25023 25069 tty2     00:00:00 Hang Monitor
25023 25070 tty2     00:00:00 BgHangManager
25023 25117 tty2     00:00:12 Cache2 I/O
25023 25118 tty2     00:05:37 Timer
25023 25146 tty2     00:00:00 DataStorage
25023 25150 tty2     00:00:00 GMPThread
25023 25152 tty2     00:04:12 Compositor
25023 25156 tty2     00:00:02 ImgDecoder #1
25023 25157 tty2     00:00:02 ImgDecoder #2
25023 25158 tty2     00:00:02 ImgDecoder #3
25023 25159 tty2     00:00:00 ImageIO
25023 25160 tty2     00:02:26 SoftwareVsyncTh
25023 25202 tty2     00:00:00 HTML5 Parser
25023 25204 tty2     00:00:00 IPDL Background
25023 25212 tty2     00:00:22 DOM Worker
25023 25219 tty2     00:00:23 URL Classifier
25023 25220 tty2     00:00:05 ImageBridgeChil
25023 25221 tty2     00:00:04 mozStorage #1
25023 25222 tty2     00:00:00 Proxy R~olution
25023 25223 tty2     00:00:00 DataStorage
25023 25234 tty2     00:00:00 Cache I/O
25023 25235 tty2     00:00:01 mozStorage #2
25023 25236 tty2     00:00:00 mozStorage #3
25023 25252 tty2     00:00:20 DOM Worker
25023 25256 tty2     00:00:00 mozStorage #4
25023 25257 tty2     00:00:00 localStorage DB
25023 25260 tty2     00:00:00 mozStorage #5
25023  2066 tty2     00:00:19 DOM Worker
25023  5988 tty2     00:00:00 threaded-ml
25023  6757 tty2     00:00:00 firefox
25023  6999 tty2     00:00:00 mozStorage #6
25023 19239 tty2     00:00:02 mozStorage #7
25023 12038 tty2     00:00:05 threaded-ml
25023 19301 tty2     00:00:00 firefox
[root@jouet ~]#
[root@jouet ~]# file /proc/25023/tasks
/proc/25023/tasks: cannot open `/proc/25023/tasks' (No such file or directory)
[root@jouet ~]#

But...

[root@jouet ~]# file /proc/25023/task
/proc/25023/task: directory
[root@jouet ~]# wc -l /proc/25023/task/25023/maps
1200 /proc/25023/task/25023/maps
[root@jouet ~]# wc -l /proc/25023/task/25048/maps
1201 /proc/25023/task/25048/maps
[root@jouet ~]# wc -l /proc/25048/task/25048/maps
1201 /proc/25048/task/25048/maps
[root@jouet ~]#

[acme@jouet linux]$ grep \"tasks\" fs/proc/*.c
[acme@jouet linux]$ grep \"task\" fs/proc/*.c
fs/proc/base.c:	DIR("task", S_IRUGO|S_IXUGO, proc_task_inode_operations, proc_task_operations),
fs/proc/base.c:		name.name = "task";
[acme@jouet linux]$

These end up mapping to the latter of these two:

const struct file_operations proc_pid_maps_operations = {
	.open		= pid_maps_open,
	.read		= seq_read,
	.llseek		= seq_lseek,
	.release	= proc_map_release,
};

const struct file_operations proc_tid_maps_operations = {
	.open		= tid_maps_open,
	.read		= seq_read,
	.llseek		= seq_lseek,
	.release	= proc_map_release,
};

Which ends up going down to:

  show_map_vma(struct seq_file *m, struct vm_area_struct *vma, int is_pid)

with is_pid respectively true and false, but then show_map_vma() doesn't
use 'is_pid' at all :-\

/me scratches head, what am I missing?

The answer: the following commit, which appeared circa v4.9-rc2. So, fixing
up the "tasks" -> "task" typo, we end up with something safe that also
avoids this by-now non-existent problem on older kernels, ok?
commit b18cb64ead400c01bf1580eeba330ace51f8087d
Author: Andy Lutomirski
Date:   Fri Sep 30 10:58:57 2016 -0700

    fs/proc: Stop trying to report thread stacks

    This reverts more of:

      b76437579d13 ("procfs: mark thread stack correctly in proc/<pid>/maps")

    ... which was partially reverted by:

      65376df58217 ("proc: revert /proc/<pid>/maps [stack:TID] annotation")

    Originally, /proc/PID/task/TID/maps was the same as /proc/TID/maps.

    In current kernels, /proc/PID/maps (or /proc/TID/maps even for
    threads) shows "[stack]" for VMAs in the mm's stack address range.

    In contrast, /proc/PID/task/TID/maps uses KSTK_ESP to guess the
    target thread's stack's VMA.  This is racy, probably returns garbage
    and, on arches with CONFIG_THREAD_INFO_IN_TASK=y, is also crash-prone:
    KSTK_ESP is not safe to use on tasks that aren't known to be running
    ordinary process-context kernel code.

    This patch removes the difference and just shows "[stack]" for VMAs
    in the mm's stack range.  This is IMO much more sensible -- the
    actual "stack" address really is treated specially by the VM code,
    and the current thread stack isn't even well-defined for programs
    that frequently switch stacks on their own.

    Reported-by: Jann Horn
    Signed-off-by: Andy Lutomirski
    Acked-by: Thomas Gleixner

>
> 	fp = fopen(filename, "r");
> 	if (fp == NULL) {
> --
> 2.5.0
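PS: just to make the "no loss of functionality" argument concrete, below is
a simplified, self-contained sketch of what a perf-like consumer pulls out
of each maps line. It is not the actual tools/perf parser (the names and
the exact sscanf format are made up for the illustration); the point is
only that the "[stack]"/"[stack:TID]" annotation, when present, shows up
in the pathname column and is carried along verbatim, so the kernel
dropping the per-thread stack guesswork costs this kind of consumer
nothing.

/*
 * Illustration only, not the tools/perf code.
 * Build: gcc -O2 -o parsemaps parsemaps.c   (reads /proc/self/maps)
 */
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	char line[8192], prot[5], execname[4096];
	uint64_t start, end, pgoff, ino;
	unsigned int maj, min;
	FILE *fp = fopen("/proc/self/maps", "r");

	if (fp == NULL) {
		perror("/proc/self/maps");
		return 1;
	}

	while (fgets(line, sizeof(line), fp) != NULL) {
		execname[0] = '\0';
		/* <start>-<end> <perms> <pgoff> <maj>:<min> <inode> <pathname> */
		if (sscanf(line, "%" SCNx64 "-%" SCNx64 " %4s %" SCNx64
				 " %x:%x %" SCNu64 " %4095[^\n]",
			   &start, &end, prot, &pgoff, &maj, &min, &ino,
			   execname) < 7)
			continue;
		/* "[stack]" simply lands in execname like any other path */
		printf("%#" PRIx64 "-%#" PRIx64 " %s %s\n",
		       start, end, prot, execname);
	}
	fclose(fp);
	return 0;
}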