Message-ID: <564C3BAA.4040806@huawei.com>
Date: Wed, 18 Nov 2015 16:49:46 +0800
From: "Wangnan (F)"
To: Ingo Molnar
CC: Jiri Olsa, Arnaldo Carvalho de Melo, David Ahern, Peter Zijlstra, Milian Wolff, pi3orama, lizefan 00213767
Subject: Re: [BUG REPORT] perf tools: x86_64: Broken calllchain when sampling taken at 'callq' instruction
References: <564C26C4.2040603@huawei.com> <564C3011.8090002@huawei.com> <20151118082033.GA24726@gmail.com> <564C3A0E.3030502@huawei.com>
In-Reply-To: <564C3A0E.3030502@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2015/11/18 16:42, Wangnan (F) wrote:
>
>
> On 2015/11/18 16:20, Ingo Molnar wrote:
>> * Wangnan (F) wrote:
>>
>>> On 2015/11/18 15:20, Wangnan (F) wrote:
>>>> Hi all,
>>>>
>>>> While analysing Jiri's patchset [1] I found a DWARF unwind problem.
>>>> On x86, when a sample is taken at a 'callq' instruction, DWARF-based
>>>> stack unwinding always fails.
>>>>
>>>> I compiled a small C source file with debug information, with the
>>>> frame pointer omitted and optimization disabled:
>>>>
>>>>  $ gcc -g -O0 -fomit-frame-pointer ./test_dwarf_unwind.c -o ./test_dwarf_unwind
>>> For anyone who wants to test it, here is the test code I used:
>>>
>>> /* assumed headers: the code needs NULL and gettimeofday() */
>>> #include <stddef.h>
>>> #include <sys/time.h>
>>>
>>> static volatile int x = 0;
>>>
>>> int funcc(void)
>>> {
>>> 	struct timeval tv1, tv2;
>>> 	unsigned long us1, us2;
>>>
>>> 	gettimeofday(&tv1, NULL);
>>>
>>> 	us1 = tv1.tv_sec * 1000000 + tv1.tv_usec;
>>>
>>> 	/* spin for about 3 seconds so the sampler has something to hit */
>>> 	while (1) {
>>> 		x = x + 100;
>>> 		gettimeofday(&tv2, NULL);
>>> 		us2 = tv2.tv_sec * 1000000 + tv2.tv_usec;
>>> 		if (us2 - us1 >= 3000000)
>>> 			break;
>>> 	}
>>> 	return x;
>>> }
>>> int funcb(void) { return funcc();}
>>> int funca(void) { return funcb();}
>>> int main() { funca(); return 0;}
>> What CPU model is this, and what event was used - PEBS perhaps? This
>> might be some sort of PMU sampling bug/quirk/misfeature - or perhaps a
>> kernel side fixup that went bad?
>
> $ cat /proc/cpuinfo
> processor       : 0
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 60
> model name      : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
> stepping        : 3
> microcode       : 0x1c
> cpu MHz         : 3600.000
> cache size      : 8192 KB
> physical id     : 0
> siblings        : 8
> core id         : 0
> cpu cores       : 4
> apicid          : 0
> initial apicid  : 0
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 13
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
> syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
> rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq
> dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm
> pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave
> avx f16c rdrand lahf_lm abm ida arat epb pln pts dtherm tpr_shadow
> vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2
> erms invpcid xsaveopt
> bugs            :
> bogomips        : 7183.88
> clflush size    : 64
> cache_alignment : 64
> address sizes   : 39 bits physical, 48 bits virtual
> power management:
>
>
> The perf command line is
>
>  # ./perf record -g -F 9 --call-graph dwarf ./test_dwarf_unwind
>
> It uses the default event; precise_ip == 2, so it uses PEBS.
>

Tested 'cycles', 'cycles:p' and 'cycles:pp'. Only 'cycles:pp' captures
samples at callq. So maybe a PEBS problem?

Thank you.
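
For reference, the three event specs tested above differ only in the
precise level they request: each 'p' modifier raises
perf_event_attr.precise_ip by one, so 'cycles:pp' asks for
precise_ip == 2. A minimal sketch of that mapping follows. The helper
precise_level() is hypothetical and is not taken from the perf sources;
it only counts 'p' characters after the last ':' and ignores every other
modifier.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of perf): return the precise_ip level
 * implied by an event spec such as "cycles:pp", i.e. the number of 'p'
 * modifiers that follow the last ':'.
 */
int precise_level(const char *spec)
{
	const char *mod = strrchr(spec, ':');
	int level = 0;

	if (!mod)
		return 0;	/* no modifier part: precise_ip == 0 */

	/* skip the ':' itself, then count 'p' among the modifier letters */
	for (mod++; *mod; mod++)
		if (*mod == 'p')
			level++;

	return level;
}
```

With this sketch, "cycles" maps to 0, "cycles:p" to 1 and "cycles:pp" to
2, which is the level at which the callq samples showed up in the test
above.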