Message-ID: <5087D678.1080905@intel.com>
Date: Wed, 24 Oct 2012 19:52:24 +0800
From: "Yan, Zheng"
To: Stephane Eranian
CC: LKML, Peter Zijlstra, "ak@linux.intel.com"
Subject: Re: [PATCH V2 6/7] perf, x86: Use LBR call stack to get user callchain
References: <1351058350-9159-1-git-send-email-zheng.z.yan@intel.com> <1351058350-9159-7-git-send-email-zheng.z.yan@intel.com> <5087CF96.7010406@intel.com>

On 10/24/2012 07:47 PM, Stephane Eranian wrote:
> On Wed, Oct 24, 2012 at 1:23 PM, Yan, Zheng wrote:
>> On 10/24/2012 04:57 PM, Stephane Eranian wrote:
>>> On Wed, Oct 24, 2012 at 7:59 AM, Yan, Zheng wrote:
>>>> From: "Yan, Zheng"
>>>>
>>>> Try enabling the LBR call stack feature if event request recording
>>>> callchain. Try utilizing the LBR call stack to get user callchain
>>>> in case of there is no frame pointer.
>>>>
>>>> Signed-off-by: Yan, Zheng
>>>> ---
>>>>  arch/x86/kernel/cpu/perf_event.c           | 126 +++++++++++++++++++++--------
>>>>  arch/x86/kernel/cpu/perf_event.h           |   7 ++
>>>>  arch/x86/kernel/cpu/perf_event_intel.c     |  20 ++---
>>>>  arch/x86/kernel/cpu/perf_event_intel_lbr.c |   3 +
>>>>  include/linux/perf_event.h                 |   6 ++
>>>>  kernel/events/core.c                       |  11 ++-
>>>>  6 files changed, 124 insertions(+), 49 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
>>>> index 8ae8044..3bf2100 100644
>>>> --- a/arch/x86/kernel/cpu/perf_event.c
>>>> +++ b/arch/x86/kernel/cpu/perf_event.c
>>>> @@ -398,35 +398,46 @@ int x86_pmu_hw_config(struct perf_event *event)
>>>>
>>>>  		if (event->attr.precise_ip > precise)
>>>>  			return -EOPNOTSUPP;
>>>> -		/*
>>>> -		 * check that PEBS LBR correction does not conflict with
>>>> -		 * whatever the user is asking with attr->branch_sample_type
>>>> -		 */
>>>> -		if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format < 2) {
>>>> -			u64 *br_type = &event->attr.branch_sample_type;
>>>> -
>>>> -			if (has_branch_stack(event)) {
>>>> -				if (!precise_br_compat(event))
>>>> -					return -EOPNOTSUPP;
>>>> -
>>>> -				/* branch_sample_type is compatible */
>>>> -
>>>> -			} else {
>>>> -				/*
>>>> -				 * user did not specify branch_sample_type
>>>> -				 *
>>>> -				 * For PEBS fixups, we capture all
>>>> -				 * the branches at the priv level of the
>>>> -				 * event.
>>>> -				 */
>>>> -				*br_type = PERF_SAMPLE_BRANCH_ANY;
>>>> -
>>>> -				if (!event->attr.exclude_user)
>>>> -					*br_type |= PERF_SAMPLE_BRANCH_USER;
>>>> -
>>>> -				if (!event->attr.exclude_kernel)
>>>> -					*br_type |= PERF_SAMPLE_BRANCH_KERNEL;
>>>> -			}
>>>> +	}
>>>> +	/*
>>>> +	 * check that PEBS LBR correction does not conflict with
>>>> +	 * whatever the user is asking with attr->branch_sample_type
>>>> +	 */
>>>> +	if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format < 2) {
>>>> +		u64 *br_type = &event->attr.branch_sample_type;
>>>> +
>>>> +		if (has_branch_stack(event)) {
>>>> +			if (!precise_br_compat(event))
>>>> +				return -EOPNOTSUPP;
>>>> +
>>>> +			/* branch_sample_type is compatible */
>>>> +
>>>> +		} else {
>>>> +			/*
>>>> +			 * user did not specify branch_sample_type
>>>> +			 *
>>>> +			 * For PEBS fixups, we capture all
>>>> +			 * the branches at the priv level of the
>>>> +			 * event.
>>>> +			 */
>>>> +			*br_type = PERF_SAMPLE_BRANCH_ANY;
>>>> +
>>>> +			if (!event->attr.exclude_user)
>>>> +				*br_type |= PERF_SAMPLE_BRANCH_USER;
>>>> +
>>>> +			if (!event->attr.exclude_kernel)
>>>> +				*br_type |= PERF_SAMPLE_BRANCH_KERNEL;
>>>> +		}
>>>> +	} else if (event->attr.sample_type & PERF_SAMPLE_CALLCHAIN) {
>>>> +		if (!has_branch_stack(event) && x86_pmu.attr_lbr_callstack) {
>>>> +			/*
>>>> +			 * user did not specify branch_sample_type,
>>>> +			 * try using the LBR call stack facility to
>>>> +			 * record call chains in the user space.
>>>> +			 */
>>>> +			event->attr.branch_sample_type =
>>>> +				PERF_SAMPLE_BRANCH_USER |
>>>> +				PERF_SAMPLE_BRANCH_CALL_STACK;
>>>
>>> You are forcing user level here, but how do you know the user wanted
>>> ONLY user level
>>> callchains?
>>>
>>>
>>
>> The LBR call stack is used only when the frame pointer approach doesn't work.
>
> And where is that determination made?

Check the code that is added to perf_callchain_user and perf_callchain_user32.

>
>> I think the kernel has frame pointer for the most cases.
>> The second reason is that the LBR call stack only has 16 entries. I think
>> it's too small to record both kernel and user call chains.
>>
> It's even too small for many object oriented user programs as well.
>
>> Regards
>> Yan, Zheng
>>
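
For readers following the thread, the branch type selection in the quoted
x86_pmu_hw_config() hunk can be modelled in plain user-space C roughly as
below. This is only a sketch under stated assumptions, not the kernel code:
the struct names, the MODEL_* flag values and model_pick_branch_type() are
hypothetical stand-ins, has_branch_stack() is modelled as a sample_type bit,
and the precise_br_compat() check is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for this model only; not the kernel ABI values. */
#define MODEL_SAMPLE_CALLCHAIN    (1u << 0)
#define MODEL_SAMPLE_BRANCH_STACK (1u << 1)

#define MODEL_BRANCH_USER         (1u << 0)
#define MODEL_BRANCH_KERNEL       (1u << 1)
#define MODEL_BRANCH_ANY          (1u << 2)
#define MODEL_BRANCH_CALL_STACK   (1u << 3)

/* Simplified stand-in for the perf_event_attr fields the hunk reads. */
struct model_event {
	unsigned int precise_ip;
	bool exclude_user;
	bool exclude_kernel;
	uint64_t sample_type;
	uint64_t branch_sample_type;
};

/* Simplified stand-in for the x86_pmu capabilities the hunk reads. */
struct model_pmu {
	unsigned int pebs_format;
	bool lbr_callstack;	/* models x86_pmu.attr_lbr_callstack */
};

/*
 * Return the branch_sample_type the event ends up with, following the
 * if / else-if structure of the quoted hunk.
 */
static uint64_t model_pick_branch_type(const struct model_event *e,
				       const struct model_pmu *p)
{
	bool has_branch_stack = e->sample_type & MODEL_SAMPLE_BRANCH_STACK;

	if (e->precise_ip > 1 && p->pebs_format < 2) {
		uint64_t br_type;

		/* User asked for a branch stack: keep the request as-is. */
		if (has_branch_stack)
			return e->branch_sample_type;

		/* PEBS fixup: all branches at the event's priv levels. */
		br_type = MODEL_BRANCH_ANY;
		if (!e->exclude_user)
			br_type |= MODEL_BRANCH_USER;
		if (!e->exclude_kernel)
			br_type |= MODEL_BRANCH_KERNEL;
		return br_type;
	}

	if ((e->sample_type & MODEL_SAMPLE_CALLCHAIN) &&
	    !has_branch_stack && p->lbr_callstack) {
		/* Callchain requested: LBR call stack, user level only. */
		return MODEL_BRANCH_USER | MODEL_BRANCH_CALL_STACK;
	}

	return e->branch_sample_type;
}
```

In this model the callchain branch never consults exclude_user or
exclude_kernel, which is the behaviour Stephane is asking about above.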