From: Stephane Eranian
To: linux-kernel@vger.kernel.org
Cc: acme@redhat.com, peterz@infradead.org, mingo@elte.hu, jolsa@redhat.com, namhyung.kim@kernel.org
Subject: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale
Date: Tue, 14 Mar 2017 23:57:21 -0700
Message-Id: <1489561041-19778-1-git-send-email-eranian@google.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-kernel@vger.kernel.org

This patch significantly improves the execution time of
perf_event__synthesize_mmap_events() when running perf record on systems
where processes have lots of threads.

It turns out that the kernel code behind /proc/pid/maps uses an O(N^2)
algorithm to generate the lines of the maps file. If you have 1000
threads, then you necessarily have 1000 stacks, and for each vma the
kernel needs to check whether it corresponds to a thread's stack. With a
large number of threads, this can take a very long time. I have seen
latencies well above 10 minutes.

As of today, perf does not use the fact that a mapping is a stack,
therefore we can work around the issue by using /proc/pid/task/pid/maps.
This entry does not try to map a vma to a stack and is thus much faster,
with no loss of functionality.

The proc-map-timeout logic is kept in case users still want some upper
limit.

Signed-off-by: Stephane Eranian
---
 tools/perf/util/event.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 4ea7ce7..b137566 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
 	if (machine__is_default_guest(machine))
 		return 0;
 
-	snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
-		 machine->root_dir, pid);
+	snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps",
+		 machine->root_dir, pid, pid);
 
 	fp = fopen(filename, "r");
 	if (fp == NULL) {
-- 
2.5.0
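
For anyone who wants to compare the two procfs entries outside of perf, here is a minimal standalone sketch. It is not part of the patch: the helper names (proc_maps_path, task_maps_path, count_map_lines) are hypothetical and only mirror the snprintf()/fopen() pattern in the hunk above, with the per-task path assumed to be /proc/<pid>/task/<tid>/maps.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: per-process maps path, as built before the patch. */
static int proc_maps_path(char *buf, size_t len, const char *root_dir,
			  pid_t pid)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s/proc/%d/maps", root_dir, (int)pid);
}

/* Hypothetical helper: per-task maps path, as built after the patch. */
static int task_maps_path(char *buf, size_t len, const char *root_dir,
			  pid_t pid, pid_t tid)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%s/proc/%d/task/%d/maps",
			root_dir, (int)pid, (int)tid);
}

/*
 * Count newline-terminated lines in a maps file.
 * Returns -1 if the file cannot be opened.
 */
static long count_map_lines(const char *filename)
{
	char line[4096];
	long n = 0;
	FILE *fp = fopen(filename, "r");

	if (fp == NULL)
		return -1;

	while (fgets(line, sizeof(line), fp) != NULL) {
		/* A long line may arrive in several chunks; count it once. */
		if (strchr(line, '\n') != NULL)
			n++;
	}

	fclose(fp);
	return n;
}
```

With an empty root_dir, task_maps_path(buf, sizeof(buf), "", 1234, 1234) produces "/proc/1234/task/1234/maps". Timing count_map_lines() on both paths for a heavily threaded process is one way to check whether the slowdown described above shows up on a given kernel.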