Date: Tue, 16 Oct 2012 12:13:41 +0200
Subject: [RFC] perf: need to expose sched_clock to correlate user samples with kernel samples
From: Stephane Eranian
To: LKML
Cc: Peter Zijlstra, "mingo@elte.hu", Paul Mackerras, Anton Blanchard, Will Deacon, "ak@linux.intel.com", Pekka Enberg, Steven Rostedt, Robert Richter
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

There are many situations where we want to correlate events happening at
the user level with samples recorded in the perf_event kernel sampling
buffer. For instance, we might want to correlate the call to a function or
the creation of a file with samples. Similarly, when we want to monitor a
JVM with jitted code, we need to be able to correlate jitted code mappings
with perf event samples for symbolization.

Perf_events allows timestamping of samples with PERF_SAMPLE_TIME. That
causes each PERF_RECORD_SAMPLE to include a timestamp generated by calling
the local_clock() -> sched_clock_cpu() function.

To make correlating user vs. kernel samples easy, we would need access to
that sched_clock() functionality. However, none of the existing clock calls
permit this at this point. They all return timestamps which do not use the
same source and/or offset as sched_clock. I believe a similar issue exists
with the ftrace subsystem.

The problem needs to be addressed in a portable manner.
Solutions based on reading the TSC at the user level to reconstruct
sched_clock() don't seem appropriate to me.

One possibility to address this limitation would be to extend
clock_gettime() with a new clock type, e.g., CLOCK_PERF.

However, I understand that sched_clock_cpu() provides ordering guarantees
only when invoked on the same CPU repeatedly, i.e., it's not globally
synchronized. But we already have to deal with this problem when merging
samples obtained from different CPU sampling buffers in per-thread mode.
So this is not necessarily a showstopper.

Alternatives could be to use uprobes, but that's less practical to set up.

Anyone with better ideas?
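
To make the clock_gettime() idea a bit more concrete, here is a rough
user-level sketch, not a definitive design. CLOCK_PERF is only the name
proposed above and does not exist; the sketch falls back to CLOCK_MONOTONIC
so that it compiles, which means the timestamps it produces are NOT
comparable with PERF_SAMPLE_TIME values. The struct and the helper names
(user_event, clock_now_ns, record_user_event, first_sample_at_or_after) are
hypothetical, made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * ASSUMPTION: CLOCK_PERF is the clock id proposed in this mail; it is not
 * a real clock. CLOCK_MONOTONIC stands in so the sketch builds, but its
 * values do not share a source or offset with sched_clock().
 */
#ifndef CLOCK_PERF
#define CLOCK_PERF CLOCK_MONOTONIC
#endif

/* Hypothetical user-level event record, e.g. "jitted method mapped". */
struct user_event {
	uint64_t time_ns;	/* timestamp taken at the user level */
	const char *what;	/* free-form label */
};

/* Read a clock and return its value in nanoseconds, or 0 on failure. */
uint64_t clock_now_ns(clockid_t clk)
{
	struct timespec ts;

	if (clock_gettime(clk, &ts) != 0)
		return 0;
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Timestamp a user-level event with the (stand-in) perf clock. */
struct user_event record_user_event(const char *what)
{
	struct user_event ev = { clock_now_ns(CLOCK_PERF), what };

	return ev;
}

/*
 * Given sample timestamps sorted in ascending order, return the index of
 * the first sample taken at or after event_ns, or n if there is none.
 * This is only meaningful when both sides use the same clock.
 */
size_t first_sample_at_or_after(const uint64_t *sample_ns, size_t n,
				uint64_t event_ns)
{
	size_t lo = 0, hi = n;

	assert(sample_ns != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (sample_ns[mid] < event_ns)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

A tool would call record_user_event() when, say, a jitted code mapping is
created, then use first_sample_at_or_after() on the PERF_SAMPLE_TIME values
from one sampling buffer to find where the event falls. Since
sched_clock_cpu() is not globally synchronized, the lookup is best thought
of as per-buffer, with the same caveats as merging buffers today.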