From: Mathieu Poirier
Date: Mon, 8 Jul 2019 11:33:59 -0600
Subject: Re: [PATCH v3] perf cs-etm: Improve completeness for kernel address space
To: Leo Yan
Cc: Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
 Suzuki K Poulose, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
 Song Liu, Yonghong Song, Linux Kernel Mailing List, linux-arm-kernel,
 netdev@vger.kernel.org, bpf@vger.kernel.org, Peter Zijlstra, Coresight ML
In-Reply-To: <20190620034446.25561-1-leo.yan@linaro.org>

On Wed, 19 Jun 2019 at 21:45, Leo Yan wrote:
>
> The arm and arm64 architectures reserve some memory regions prior to the
> symbol '_stext', and these regions are later used by device modules and
> the BPF JIT. The current code fails to consider these memory regions, so
> any address in them is treated as user space mode. perf then cannot find
> the corresponding dso with the wrong CPU mode, so we fail to generate
> samples for device module and BPF related trace data.
>
> This patch parses the linker scripts to get the size of the memory prior
> to the start address and subtracts this size from
> 'etmq->etm->kernel_start', giving a fixed-up kernel start address that
> covers the memory regions for device modules and BPF. As a result,
> cs_etm__cpu_mode() returns the right mode for these memory regions and
> perf can successfully generate samples.
>
> The reason for parsing the linker scripts is that the Arm architecture
> changes the text offset depending on the platform; multiple text offsets
> are defined in $kernel/arch/arm/Makefile. The offset is decided when the
> kernel is built and the final value is expanded in the linker script, so
> we can extract the value used from the linker script. We parse the arm64
> linker script the same way. If the linker script cannot be found, the
> pre-start memory size is assumed to be zero, in which case this patch
> changes nothing.
>
> Below is detailed info for testing this patch:
>
> - Build LLVM/Clang 8.0 or later version;
>
> - Configure perf with ~/.perfconfig:
>
>   root@debian:~# cat ~/.perfconfig
>   # this file is auto-generated.
>   [llvm]
>           clang-path = /mnt/build/llvm-build/build/install/bin/clang
>           kbuild-dir = /mnt/linux-kernel/linux-cs-dev/
>           clang-opt = "-g"
>           dump-obj = true
>
>   [trace]
>           show_zeros = yes
>           show_duration = no
>           no_inherit = yes
>           show_timestamp = no
>           show_arg_names = no
>           args_alignment = 40
>           show_prefix = yes
>
> - Run 'perf trace' command with eBPF event:
>
>   root@debian:~# perf trace -e string \
>       -e $kernel/tools/perf/examples/bpf/augmented_raw_syscalls.c
>
> - Read eBPF program memory mapping in kernel:
>
>   root@debian:~# echo 1 > /proc/sys/net/core/bpf_jit_kallsyms
>   root@debian:~# cat /proc/kallsyms | grep -E "bpf_prog_.+_sys_[enter|exit]"
>   ffff000000086a84 t bpf_prog_f173133dc38ccf87_sys_enter  [bpf]
>   ffff000000088618 t bpf_prog_c1bd85c092d6e4aa_sys_exit   [bpf]
>
> - Launch any program which accesses file system frequently so can hit
>   the system calls trace flow with eBPF event;
>
> - Capture CoreSight trace data with filtering eBPF program:
>
>   root@debian:~# perf record -e cs_etm/@20070000.etr/ \
>       --filter 'filter 0xffff000000086a84/0x800' -a sleep 5s
>
> - Annotate for symbol 'bpf_prog_f173133dc38ccf87_sys_enter':
>
>   root@debian:~# perf report
>   Then select 'branches' samples and press 'a' to annotate symbol
>   'bpf_prog_f173133dc38ccf87_sys_enter', press 'P' to print to the
>   bpf_prog_f173133dc38ccf87_sys_enter.annotation file:
>
>   root@debian:~# cat bpf_prog_f173133dc38ccf87_sys_enter.annotation
>
>   bpf_prog_f173133dc38ccf87_sys_enter() bpf_prog_f173133dc38ccf87_sys_enter
>   Event: branches
>
>   Percent      int sys_enter(struct syscall_enter_args *args)
>                  stp  x29, x30, [sp, #-16]!
>
>                        int key = 0;
>                  mov  x29, sp
>
>                        augmented_args = bpf_map_lookup_elem(&augmented_filename_map, &key);
>                  stp  x19, x20, [sp, #-16]!
>
>                        augmented_args = bpf_map_lookup_elem(&augmented_filename_map, &key);
>                  stp  x21, x22, [sp, #-16]!
>
>                  stp  x25, x26, [sp, #-16]!
>
>                        return bpf_get_current_pid_tgid();
>                  mov  x25, sp
>
>                        return bpf_get_current_pid_tgid();
>                  mov  x26, #0x0                       // #0
>
>                  sub  sp, sp, #0x10
>
>                        return bpf_map_lookup_elem(pids, &pid) != NULL;
>                  add  x19, x0, #0x0
>
>                  mov  x0, #0x0                        // #0
>
>                  mov  x10, #0xfffffffffffffff8        // #-8
>
>                        if (pid_filter__has(&pids_filtered, getpid()))
>                  str  w0, [x25, x10]
>
>                        probe_read(&augmented_args->args, sizeof(augmented_args->args), args);
>                  add  x1, x25, #0x0
>
>                        probe_read(&augmented_args->args, sizeof(augmented_args->args), args);
>                  mov  x10, #0xfffffffffffffff8        // #-8
>
>                        syscall = bpf_map_lookup_elem(&syscalls, &augmented_args->args.syscall_nr);
>                  add  x1, x1, x10
>
>                        syscall = bpf_map_lookup_elem(&syscalls, &augmented_args->args.syscall_nr);
>                  mov  x0, #0xffff8009ffffffff         // #-140694538682369
>
>                  movk x0, #0x6698, lsl #16
>
>                  movk x0, #0x3e00
>
>                  mov  x10, #0xffffffffffff1040        // #-61376
>
>                        if (syscall == NULL || !syscall->enabled)
>                  movk x10, #0x1023, lsl #16
>
>                        if (syscall == NULL || !syscall->enabled)
>                  movk x10, #0x0, lsl #32
>
>                        loop_iter_first()
>     3.69       → blr  bpf_prog_f173133dc38ccf87_sys_enter
>                        loop_iter_first()
>                  add  x7, x0, #0x0
>
>                        loop_iter_first()
>                  add  x20, x7, #0x0
>
>                        int size = probe_read_str(&augmented_filename->value, filename_len, filename_arg);
>                  mov  x0, #0x1                        // #1

I'm not sure all this information about annotation should be in the
changelog. This patch is about being able to decode traces that executed
outside the current kernel address range, and as such simply using
"perf report" or "perf script" successfully is enough to test this set.
Any information that goes beyond that muddies the water.

>
> [...]
>
> Cc: Mathieu Poirier
> Cc: Alexander Shishkin
> Cc: Jiri Olsa
> Cc: Namhyung Kim
> Cc: Peter Zijlstra
> Cc: Suzuki Poulose
> Cc: coresight@lists.linaro.org
> Cc: linux-arm-kernel@lists.infradead.org
> Signed-off-by: Leo Yan
> ---
>  tools/perf/Makefile.config | 22 ++++++++++++++++++++++
>  tools/perf/util/cs-etm.c   | 19 ++++++++++++++++++-
>  2 files changed, 40 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> index 51dd00f65709..a58cd5a43a98 100644
> --- a/tools/perf/Makefile.config
> +++ b/tools/perf/Makefile.config
> @@ -418,6 +418,28 @@ ifdef CORESIGHT
>      endif
>      LDFLAGS += $(LIBOPENCSD_LDFLAGS)
>      EXTLIBS += $(OPENCSDLIBS)
> +    PRE_START_SIZE := 0
> +    ifneq ($(wildcard $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds),)
> +      ifeq ($(SRCARCH),arm64)
> +        # Extract info from lds:
> +        #  . = ((((((((0xffffffffffffffff)) - (((1)) << (48)) + 1) + (0)) + (0x08000000))) + (0x08000000))) + 0x00080000;
> +        # PRE_START_SIZE := (0x08000000 + 0x08000000 + 0x00080000) = 0x10080000
> +        PRE_START_SIZE := $(shell egrep ' \. \= \({8}0x[0-9a-fA-F]+\){2}' \
> +          $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds | \
> +          sed -e 's/[(|)|.|=|+|<|;|-]//g' -e 's/ \+/ /g' -e 's/^[ \t]*//' | \
> +          awk -F' ' '{printf "0x%x", $$6+$$7+$$8}' 2>/dev/null)
> +      endif
> +      ifeq ($(SRCARCH),arm)
> +        # Extract info from lds:
> +        #   . = ((0xC0000000)) + 0x00208000;
> +        # PRE_START_SIZE := 0x00208000
> +        PRE_START_SIZE := $(shell egrep ' \. \= \({2}0x[0-9a-fA-F]+\){2}' \
> +          $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds | \
> +          sed -e 's/[(|)|.|=|+|<|;|-]//g' -e 's/ \+/ /g' -e 's/^[ \t]*//' | \
> +          awk -F' ' '{printf "0x%x", $$2}' 2>/dev/null)
> +      endif
> +    endif
> +    CFLAGS += -DARM_PRE_START_SIZE=$(PRE_START_SIZE)

It might be useful to do this for arm and arm64 regardless of CoreSight
but I'll let Arnaldo decide on this.

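For readers following the arithmetic in the two lds comments above, here
is a rough C sketch of what the egrep/sed/awk pipeline is meant to
compute: sum the trailing constants on the location-counter line. The
helper name sum_trailing_constants() and the MAX_TOKENS limit are made up
for illustration and are not part of the patch or of perf.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_TOKENS 16	/* illustrative limit, not taken from the patch */

/*
 * Hypothetical helper: collect every numeric literal found on 'line'
 * (hex with a 0x prefix, or plain integers) and return the sum of the
 * last 'count' of them. Any other character is skipped.
 */
uint64_t sum_trailing_constants(const char *line, size_t count)
{
	uint64_t values[MAX_TOKENS];
	uint64_t sum = 0;
	size_t n = 0;
	size_t i;
	const char *p = line;

	while (*p != '\0' && n < MAX_TOKENS) {
		if (isdigit((unsigned char)*p)) {
			char *end;

			/* base 0 lets strtoull() recognise the 0x prefix */
			values[n++] = strtoull(p, &end, 0);
			p = end;
		} else {
			p++;
		}
	}

	if (count > n)
		count = n;
	for (i = n - count; i < n; i++)
		sum += values[i];

	return sum;
}
```

With the arm64 line quoted in the comment and count = 3 this gives
0x10080000; with the arm line and count = 1 it gives 0x00208000, matching
the values the comments state for PRE_START_SIZE.
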
>      $(call detected,CONFIG_LIBOPENCSD)
>      ifdef CSTRACE_RAW
>        CFLAGS += -DCS_DEBUG_RAW
> diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
> index 0c7776b51045..5fa0be3a3904 100644
> --- a/tools/perf/util/cs-etm.c
> +++ b/tools/perf/util/cs-etm.c
> @@ -613,10 +613,27 @@ static void cs_etm__free(struct perf_session *session)
>  static u8 cs_etm__cpu_mode(struct cs_etm_queue *etmq, u64 address)
>  {
>         struct machine *machine;
> +       u64 fixup_kernel_start = 0;
>
>         machine = etmq->etm->machine;
>
> -       if (address >= etmq->etm->kernel_start) {
> +       /*
> +        * Since arm and arm64 specify some memory regions prior to
> +        * 'kernel_start', kernel addresses can be less than 'kernel_start'.
> +        *
> +        * For arm architecture, the 16MB virtual memory space prior to
> +        * 'kernel_start' is allocated to device modules, a PMD table if
> +        * CONFIG_HIGHMEM is enabled and a PGD table.
> +        *
> +        * For arm64 architecture, the root PGD table, device module memory
> +        * region and BPF jit region are prior to 'kernel_start'.
> +        *
> +        * To reflect the complete kernel address space, compensate these
> +        * pre-defined regions for kernel start address.
> +        */
> +       fixup_kernel_start = etmq->etm->kernel_start - ARM_PRE_START_SIZE;
> +
> +       if (address >= fixup_kernel_start) {
>                 if (machine__is_host(machine))
>                         return PERF_RECORD_MISC_KERNEL;
>                 else

Tested-by: Mathieu Poirier

> --
> 2.17.1
>
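
As a standalone illustration of the cs-etm.c hunk, the sketch below
applies the same fix-up to plain integers. The names
cpu_mode_for_address(), MODE_KERNEL and MODE_USER are stand-ins invented
for this example (perf itself uses the PERF_RECORD_MISC_* values and
also checks machine__is_host()), and the sample numbers assume the arm64
layout from the lds comment in the patch, where '_stext' works out to
0xffff000010080000.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the PERF_RECORD_MISC_* CPU modes; names are hypothetical. */
enum cpu_mode {
	MODE_USER,
	MODE_KERNEL,
};

/*
 * Classify 'address' against a kernel start that has been moved down by
 * 'pre_start_size', so that the module and BPF JIT regions located below
 * '_stext' are treated as kernel space.
 */
enum cpu_mode cpu_mode_for_address(uint64_t kernel_start,
				   uint64_t pre_start_size,
				   uint64_t address)
{
	uint64_t fixup_kernel_start;

	/* Guard against unsigned wrap-around in the subtraction below. */
	assert(pre_start_size <= kernel_start);
	fixup_kernel_start = kernel_start - pre_start_size;

	return address >= fixup_kernel_start ? MODE_KERNEL : MODE_USER;
}
```

With kernel_start = 0xffff000010080000 and pre_start_size = 0x10080000,
the BPF program address 0xffff000000086a84 from the test log is
classified as kernel space; with pre_start_size = 0 (the fallback when
the linker script is not found) the same address is still classified as
user space, which is the behaviour the patch sets out to fix.
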