Message-ID: <557A06D7.7030304@plumgrid.com>
Date: Thu, 11 Jun 2015 15:08:23 -0700
From: Alexei Starovoitov
To: Daniel Wagner, Steven Rostedt, Tom Zanussi
CC: "linux-kernel@vger.kernel.org", Wang Nan
Subject: Re: latency histogram with BPF
References: <557937D8.90606@bmw-carit.de>
In-Reply-To: <557937D8.90606@bmw-carit.de>

On 6/11/15 12:25 AM, Daniel Wagner wrote:
>
> In both cases BPF or based on Tom's 'hist' triggers' patches, there is
> some trickery necessary to get it working. While the first approach
> has more flexibility what you want to measure or how you want to
> present it, I suspect it will be harder to get it working/accepted.

why? Out of your patch only 2 lines are for kernel, which I wanted to
add anyway. The sample code also looks good.

> Obviously, adding kprobes to trace_preempt_[on|off] is
> asking for problems:

thanks for reporting. I reproduced it with hackbench, but my stack trace
looks completely different. Looks like some memory corruption is
happening. Not clear where. I'm still debugging.

> CPU 3
>       latency        : count     distribution
>        1 -> 1        : 0        |                                                  |
>        2 -> 3        : 0        |                                                  |
>        4 -> 7        : 0        |                                                  |
>        8 -> 15       : 25182    |************************************************* |
>       16 -> 31       : 1675     |**                                                |
>       32 -> 63       : 0        |                                                  |
>       64 -> 127      : 0        |                                                  |
>      128 -> 255      : 0        |                                                  |
>      256 -> 511      : 0        |                                                  |
>      512 -> 1023     : 0        |                                                  |
>
> The numbers look a bit too nice and low. I suspect something is going
> wrong.

I see similar numbers. 25k over 5 sec = 5k preempt on/off per second
which sounds about right for idle.

> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 2d56ce5..e4d3e76 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -172,6 +172,8 @@ static const struct bpf_func_proto *kprobe_prog_func_proto(enum bpf_func_id func
>                 return &bpf_probe_read_proto;
>         case BPF_FUNC_ktime_get_ns:
>                 return &bpf_ktime_get_ns_proto;
> +       case BPF_FUNC_get_smp_processor_id:
> +               return &bpf_get_smp_processor_id_proto;

This part I'll take into my set of patches.

> --- a/samples/bpf/bpf_helpers.h
> +++ b/samples/bpf/bpf_helpers.h
> @@ -21,6 +21,8 @@ static unsigned long long (*bpf_ktime_get_ns)(void) =
>         (void *) BPF_FUNC_ktime_get_ns;
>  static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
>         (void *) BPF_FUNC_trace_printk;
> +static unsigned int (*bpf_get_smp_processor_id)(void) =
> +       (void *) BPF_FUNC_get_smp_processor_id;

this part is already there in net-next.

> diff --git a/samples/bpf/hwlathist_kern.c b/samples/bpf/hwlathist_kern.c
> new file mode 100644
> index 0000000..7a97e7e
> --- /dev/null
> +++ b/samples/bpf/hwlathist_kern.c

looks good and useful, but I would like to find and fix the bug first
before exposing this bit as a 'sample', since right now it's
'crash me if hackbench is running' test :)
If you have any suggestions on where to look, I'm all ears.
My stack traces look like:
Running with 10*40 (== 400) tasks.
[ 12.032571] kernel BUG at ../mm/slub.c:3413!
[ 12.036004] [] rcu_process_callbacks+0x283/0x680
[ 12.036004] [] __do_softirq+0xf0/0x2b0
[ 12.036004] [] ? kthread+0xc9/0xe0
[ 12.036004] [] run_ksoftirqd+0x21/0x50

or

[ 25.788074] kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
[ 25.788801] BUG: unable to handle kernel paging request at ffff88000b1ea2a0
[ 25.800085] [] ? rcu_process_callbacks+0x22a/0x680
[ 25.800085] [] __do_softirq+0xf0/0x2b0
[ 25.800085] [] do_softirq_own_stack+0x1c/0x30
[ 25.800085]
[ 25.800085] [] do_softirq+0x3d/0x40
[ 25.800085] [] __local_bh_enable_ip+0x9a/0xb0
[ 25.800085] [] _raw_spin_unlock_bh+0x1b/0x20
[ 25.800085] [] bdi_writeback_workfn+0x1b2/0x470
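
For anyone reading along, the power-of-two buckets in the histogram quoted above can be reproduced in plain user-space C. This is only a minimal sketch under stated assumptions, not the BPF program from the patch (its body is not shown in this mail): the names log2_slot, slot_bounds, hist_add, hist_total, hist_print and MAX_SLOTS are hypothetical, and the ten-slot limit is simply read off the quoted table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_SLOTS 10	/* assumption: ten rows, 1 -> 1 up to 512 -> 1023 */

/* Power-of-two bucket index: 0..1 -> 0, 2..3 -> 1, 4..7 -> 2, 8..15 -> 3, ... */
unsigned int log2_slot(uint64_t v)
{
	unsigned int slot = 0;

	while (v >>= 1)
		slot++;
	return slot;
}

/* Inclusive bounds of a bucket, in the "8 -> 15" style of the table. */
void slot_bounds(unsigned int slot, uint64_t *lo, uint64_t *hi)
{
	assert(slot < 63);	/* keeps both shifts below 64 bits */
	*lo = 1ULL << slot;
	*hi = (1ULL << (slot + 1)) - 1;
}

/* Record one sample; values past the last bucket are clamped into it. */
void hist_add(uint64_t hist[MAX_SLOTS], uint64_t v)
{
	unsigned int slot = log2_slot(v);

	if (slot >= MAX_SLOTS)
		slot = MAX_SLOTS - 1;
	hist[slot]++;
}

/* Total number of recorded samples across all buckets. */
uint64_t hist_total(const uint64_t hist[MAX_SLOTS])
{
	uint64_t sum = 0;

	for (unsigned int i = 0; i < MAX_SLOTS; i++)
		sum += hist[i];
	return sum;
}

/* Print one "lo -> hi : count" row per bucket (no star bar). */
void hist_print(const uint64_t hist[MAX_SLOTS])
{
	for (unsigned int i = 0; i < MAX_SLOTS; i++) {
		uint64_t lo, hi;

		slot_bounds(i, &lo, &hi);
		printf("%4llu -> %-4llu : %llu\n",
		       (unsigned long long)lo, (unsigned long long)hi,
		       (unsigned long long)hist[i]);
	}
}
```

Filling slots 3 and 4 with the counts quoted for CPU 3 (25182 and 1675) makes hist_total() return 26857, which over a 5 second run is roughly 5.4k events per second, in line with the "5k per second" estimate above.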