Message-ID: <55C47B38.60400@huawei.com>
Date: Fri, 7 Aug 2015 17:32:40 +0800
From: xiakaixu
To: Alexei Starovoitov, Arnaldo Carvalho de Melo
CC: "Wangnan (F)"
Subject: [RFC] perf ebpf: The example that how to access hardware PMU counter in eBPF programs bu using perf
X-Mailing-List: linux-kernel@vger.kernel.org

By combining PMU counters, kprobes and eBPF programs, many interesting
things can be done. For example, by probing at sched:sched_switch we can
measure how IPC changes between different processes by watching the
'cycles' PMU counter; by probing at the entry and exit points of a kernel
function we can compute the cache miss rate of that function by reading
the 'cache-misses' counter at both points and taking the difference.

In summary, we can define the begin and end points of a procedure, insert
kprobes on them, attach two BPF programs and let them collect a specific
PMU counter. Further, by reading those PMU counters, a BPF program can
provide hints to resource schedulers.

I am working on giving eBPF programs the ability to access hardware PMU
counters and on using that ability from perf. In recent weeks I have
submitted the kernel space code first, and the latest version (V7) is
here (www.spinics.net/lists/netdev/msg338468.html). According to the
design plan, we still need the perf side code.
I will do it based on Wang Nan's patches (perf tools: filtering events
using eBPF programs).

Here is a simple example of an eBPF program that is loaded using perf.
It only illustrates the basic design; if it looks OK, we will implement
the perf side code according to it. Waiting for your comments.

Thanks.

====================================================================
/* One perf event array per counter, indexed by CPU id in bpf_prog(). */
struct bpf_map_def SEC("maps") my_cycles_map = {
	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
	.key_size = sizeof(int),
	.value_size = sizeof(u32),
	.max_entries = 32,
};

struct bpf_map_def SEC("maps") my_exception_map = {
	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
	.key_size = sizeof(int),
	.value_size = sizeof(u32),
	.max_entries = 32,
};

/*
 * Proposed "perf_event_map" section: associates each map with an
 * event description for perf.
 */
struct perf_event_map {
	struct bpf_map_def *map_def;
	char description[64];
};

struct perf_event_map SEC("perf_event_map") cycles = {
	.map_def = &my_cycles_map,
	.description = "cycles",
};

struct perf_event_map SEC("perf_event_map") exception = {
	.map_def = &my_exception_map,
	.description = "exception",
};

SEC("kprobe/sys_write")
int bpf_prog(struct pt_regs *ctx)
{
	u64 count_cycles, count_exception;
	u32 key = bpf_get_smp_processor_id();	/* current CPU is the map index */
	char fmt[] = "CPU-%d cyc:%llu exc:%llu\n";

	/* Read both counters for this CPU and print them to the trace pipe. */
	count_cycles = bpf_perf_event_read(&my_cycles_map, key);
	count_exception = bpf_perf_event_read(&my_exception_map, key);

	bpf_trace_printk(fmt, sizeof(fmt), key, count_cycles, count_exception);

	return 0;
}
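
The begin/end measurement described at the top of this mail comes down to
taking counter differences between two probe points. Below is a minimal
user-space sketch of that arithmetic only; it is not perf or kernel code.
The struct pmu_sample layout and the pmu_delta(), pmu_ipc() and
pmu_miss_rate() names are hypothetical and exist only for illustration.
It also assumes 'instructions' and 'cache-references' counters are
sampled alongside 'cycles' and 'cache-misses', which the example program
above does not do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of raw counter values taken at one probe point. */
struct pmu_sample {
	uint64_t cycles;
	uint64_t instructions;
	uint64_t cache_refs;
	uint64_t cache_misses;
};

/* Unsigned subtraction stays correct across a single 64-bit wraparound. */
static uint64_t pmu_delta(uint64_t begin, uint64_t end)
{
	return end - begin;
}

/* Guarded division: report 0.0 when the denominator did not advance. */
static double pmu_ratio(uint64_t num, uint64_t den)
{
	return den ? (double)num / (double)den : 0.0;
}

/* Instructions per cycle between the begin and end samples. */
static double pmu_ipc(const struct pmu_sample *b, const struct pmu_sample *e)
{
	assert(b && e);
	return pmu_ratio(pmu_delta(b->instructions, e->instructions),
			 pmu_delta(b->cycles, e->cycles));
}

/* Cache misses per cache reference between the begin and end samples. */
static double pmu_miss_rate(const struct pmu_sample *b,
			    const struct pmu_sample *e)
{
	assert(b && e);
	return pmu_ratio(pmu_delta(b->cache_misses, e->cache_misses),
			 pmu_delta(b->cache_refs, e->cache_refs));
}
```

In an actual setup the two samples would be the values returned by
bpf_perf_event_read() in the entry and exit programs. Both reads would
need to happen on the same CPU for the difference to be meaningful,
since the maps above are indexed per CPU.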