Received: by 2002:a05:6358:4e97:b0:b3:742d:4702 with SMTP id ce23csp1809495rwb; Sun, 14 Aug 2022 12:56:38 -0700 (PDT) X-Google-Smtp-Source: AA6agR4EoRvDetHuobOOKK922lYv6jytFYox0jeYdpVe3jfOOwXYNvslnpFhf/xbRIZ3cZuNraHw X-Received: by 2002:a17:907:3f11:b0:731:57ed:3aea with SMTP id hq17-20020a1709073f1100b0073157ed3aeamr8322050ejc.432.1660506998032; Sun, 14 Aug 2022 12:56:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1660506998; cv=none; d=google.com; s=arc-20160816; b=y7USWa6H1GbTgFyBS6a/wWWoxiHwaJ+YXRgQptbwYmgF7fIyP3fZuYGpREj0dE7zex njTUE4VGvDUywnPdoFQoaxylFWnSTbLf0jYYSwdRv861FwgM7ELpXLf5kP5xC0huwVZ5 7trF13B7qeoP+JwSm+u0DmKmKHLesXH3W9yegrZ2VNPoEOEj94duuuSvPzBM/azoN3oh nmtqr5pA58bgkOV01mJ3XAk4Xv7pwAz3izIzHZHIFrgKKHG0XHSWdDiwFokSDJwgZbTe 5RZTxZTOVc73dZXoBZystjoAw3U/9tfPei3W6rrZgjU/lGAC16igtqi+5kMT4/lGC/LX pWMw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id:dkim-signature; bh=T1PgG3IEYKWjZugPauyZsln+an1Y02MSQBzXkZ+8BFQ=; b=KtUIXEwnxBNGdb+ZHf57GdUAptZ02EMeOVYumIdxqunLlzMDW8Q0iuYPanj5jWQ+a/ mu6c+POrWcDCjg6NJN4MdUlTdYshl7EaLnfqkiwzVMP3H6LktuxbGm8oHrzuVFqgZZ7X 5Vf9tyaRwOh0biH1YiI3yrB1sd078Km0IPP/x6O0DU/7Si50JhW55t5BciH2GI9DAGqK IoO+okwlYG1fRoDi2YWcZyVNWugIxfVCr9TwddVCT8T1807BjlQtv8mK+yJIIWjWhNwY U6+g6Azcnb0cxUV5V19F1BKDN0MPM9aQmkkNO5tFgMaIZ7x3K6tIs2V7n3plmFduszl4 kt8Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=dvoYFJbS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id o26-20020a170906359a00b0072b13aacd47si5379552ejb.194.2022.08.14.12.56.05; Sun, 14 Aug 2022 12:56:38 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=dvoYFJbS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231752AbiHNTjO (ORCPT + 99 others); Sun, 14 Aug 2022 15:39:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38804 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229563AbiHNTjN (ORCPT ); Sun, 14 Aug 2022 15:39:13 -0400 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7928F17A99 for ; Sun, 14 Aug 2022 12:39:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1660505952; x=1692041952; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=X7rL4fg6h5O7O7uR51jC3e/S6X/ZtpZUor3z4Hln1r8=; b=dvoYFJbS0qq3t1x8IMiXwQnj1ki8XOZaCubB8zoiMqkxukbB3zIOReN/ 7xlMwmOUZq8iWf5F9sT0QJpOAqqAQVe1Yyev9AmVv5DlU+Su/TFrlf5DH Urp9/1XO4Aajo2G6KqKME0MXKpopcstdqmjlPHPE+8cw6inclK9ZzZa3g TwqnONff0HemWXsFIDfXHlgZRO3YWsG5CYY+UL79av4OX2E8eZTCp9DzS HPl+yf2TB8KNc/XpeSVWZBZ8T6mOEyOIOfJMtHQqDTMUgGQHs8zNHRSNy pqnnfHbkBWeQFVuNMxgSFPBO8yGPKx5AMSkMk+TY54Hd33M44twTVGaI1 g==; X-IronPort-AV: E=McAfee;i="6400,9594,10439"; a="293119332" X-IronPort-AV: E=Sophos;i="5.93,236,1654585200"; d="scan'208";a="293119332" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Aug 2022 12:39:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,236,1654585200"; d="scan'208";a="748716170" Received: from linux.intel.com ([10.54.29.200]) by fmsmga001.fm.intel.com with ESMTP; 14 Aug 2022 12:39:11 -0700 Received: from [10.252.214.243] (kliang2-mobl1.ccr.corp.intel.com [10.252.214.243]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by linux.intel.com (Postfix) with ESMTPS id 0BD97580C50; Sun, 14 Aug 2022 12:39:09 -0700 (PDT) Message-ID: <5e3e4723-bd3e-b5d2-cf9a-6d7334f93be8@linux.intel.com> Date: Sun, 14 Aug 2022 15:39:09 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.1.2 Subject: Re: [PATCH] perf/x86/intel/lbr: fix branch type encoding Content-Language: en-US To: Arnaldo Carvalho de Melo Cc: Stephane Eranian , linux-kernel@vger.kernel.org, peterz@infradead.org, kan.liang@intel.com, ak@linux.intel.com, acme@redhat.com, namhyung@kernel.org, irogers@google.com References: <20220810210656.2799243-1-eranian@google.com> <0267c94e-7989-ca92-4175-d820d1d63a0c@linux.intel.com> <48297c1e-6e44-53f1-da7d-4437ed87cf6f@linux.intel.com> <2fc8dc1d-6922-e2e0-8b5d-fad25ab12cbd@linux.intel.com> From: "Liang, Kan" In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,NICE_REPLY_A,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2022-08-12 3:35 p.m., Arnaldo Carvalho de Melo wrote: > Em Thu, Aug 11, 2022 at 11:56:56AM -0400, Liang, Kan escreveu: >> >> >> On 2022-08-11 11:33 a.m., Stephane Eranian wrote: >>> On Thu, Aug 11, 2022 at 6:28 PM Stephane Eranian wrote: >>>> >>>> On Thu, Aug 11, 2022 at 5:42 PM Liang, Kan wrote: >>>>> >>>>> >>>>> >>>>> On 2022-08-11 10:17 a.m., Stephane Eranian wrote: >>>>>> On Thu, Aug 11, 2022 at 3:23 PM Liang, Kan wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 2022-08-10 5:06 p.m., Stephane Eranian wrote: >>>>>>>> With architected LBR, the procesosr can record the type of each sampled taken >>>>>>>> branch. The type is encoded in 4-bit field in the LBR_INFO MSR of each entry. >>>>>>>> >>>>>>>> The branch type must then extracted and saved in the perf_branch_entry in the >>>>>>>> perf_events sampling buffer. With the current code, the raw Intel encoding of >>>>>>>> the branch is exported to user tools. >>>>>>> >>>>>>> In the intel_pmu_lbr_filter(), the raw encoding will be converted into >>>>>>> the X86_BR_* format via arch_lbr_br_type_map[]. Then the >>>>>>> common_branch_type() will convert the X86_BR_* format to the generic >>>>>>> PERF_BR_* type and expose to user tools. >>>>>>> >>>>>>> I double check the existing arch_lbr_br_type_map[] and branch_map[]. >>>>>>> They should generate the same PERF_BR_* type as your arch_lbr_type_map[]. >>>>>>> >>>>>>> Is there a test case which I can use to reproduce the problem? >>>>>>> >>>>>> I was doing a simple: >>>>>> $ perf record -b -e cpu/event=0xc4/ .... >>>>>> $ perf report -D >>>>>> Looking at the LBR information and the BR type, many entries has no branch type. >>>>>> What I see is a function where you do: e->type = get_lbr_br_type() and >>>>>> that is what >>>>>> is then saved in the buffer. Unless I am missing a later patch. >>>>>> >>>>> >>>>> To get the LBR type, the save_type filter option must be applied. See >>>>> 60f83fa6341d ("perf record: Create a new option save_type in >>>>> --branch-filter"). >>>>> >>>> That seems overly complicated. I don't recall having to pass a new option >>>> to get the LBR latency. It showed up automatically. So why for branch_type? >>>> >>>>> The -b only include the ANY option. Maybe we should extend the -b option >>>>> to ANY|SAVE_TYPE. >>>>> >>>> Ok, that explains it then. I think we need to simplify. >>>> >>> In fact, I don't see a case where you would not benefit from the branch type. >>> Furthermore, not having the branch type DOES NOT save any space in the >>> branch record (given we have a reserved field). So I think I prefer not having >>> to specify yet another cmdline option to get the branch type. >> >> >> I think the option is to avoid the overhead of disassembling of branch >> instruction. See eb0baf8a0d92 ("perf/core: Define the common branch type >> classification") >> "Since the disassembling of branch instruction needs some overhead, >> a new PERF_SAMPLE_BRANCH_TYPE_SAVE is introduced to indicate if it >> needs to disassemble the branch instruction and record the branch >> type." >> >> I have no idea how big the overhead is. If we can always be benefit from >> the branch type. I guess we can make it default on. > > Would you be so nice as to add a paragraph with such explanation to the > man page so that what happened to Stephane doesn't trip other people? > Sure, I will update the document once we decide whether enable the branch type by default in the kernel for Arch LBR. Thanks, Kan > - Arnaldo > >>> In fact, if you do >>> not pass the option, then perf report -D reports some bogus branch types, i.e., >>> not all entries have empty types. >> >> Yes, that's an issue for Arch LBR. If the overhead is not a problem, it >> should be gone after we make the SAVE_TYPE default. Otherwise, I think >> we need a patch to clear the fields if the SAVE_TYPE is not applied. >> >> Thanks, >> Kan >>> >>> >>>>>> >>>>>>> Thanks, >>>>>>> Kan >>>>>>> >>>>>>>> Yet tools, such as perf, expected the >>>>>>>> branch type to be encoded using perf_events branch type enum >>>>>>>> (see tools/perf/util/branch.c). As a result of the discrepancy, the output of >>>>>>>> perf report -D shows bogus branch types. >>>>>>>> >>>>>>>> Fix the problem by converting the Intel raw encoding into the perf_events >>>>>>>> branch type enum values. With that in place and with no changes to the tools, >>>>>>>> the branch types are now reported properly. >>>>>>>> >>>>>>>> Signed-off-by: Stephane Eranian >>>>>>>> --- >>>>>>>> arch/x86/events/intel/lbr.c | 35 ++++++++++++++++++++++++++++++++--- >>>>>>>> 1 file changed, 32 insertions(+), 3 deletions(-) >>>>>>>> >>>>>>>> diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c >>>>>>>> index 4f70fb6c2c1e..ef63d4d46b50 100644 >>>>>>>> --- a/arch/x86/events/intel/lbr.c >>>>>>>> +++ b/arch/x86/events/intel/lbr.c >>>>>>>> @@ -894,9 +894,23 @@ static DEFINE_STATIC_KEY_FALSE(x86_lbr_mispred); >>>>>>>> static DEFINE_STATIC_KEY_FALSE(x86_lbr_cycles); >>>>>>>> static DEFINE_STATIC_KEY_FALSE(x86_lbr_type); >>>>>>>> >>>>>>>> -static __always_inline int get_lbr_br_type(u64 info) >>>>>>>> +/* >>>>>>>> + * Array index encodes IA32_LBR_x_INFO Branch Type Encodings >>>>>>>> + * as per Intel SDM Vol3b Branch Types section >>>>>>>> + */ >>>>>>>> +static const int arch_lbr_type_map[]={ >>>>>>>> + [0] = PERF_BR_COND, >>>>>>>> + [1] = PERF_BR_IND, >>>>>>>> + [2] = PERF_BR_UNCOND, >>>>>>>> + [3] = PERF_BR_IND_CALL, >>>>>>>> + [4] = PERF_BR_CALL, >>>>>>>> + [5] = PERF_BR_RET, >>>>>>>> +}; >>>>>>>> +#define ARCH_LBR_TYPE_COUNT ARRAY_SIZE(arch_lbr_type_map) >>>>>>>> + >>>>>>>> +static __always_inline u16 get_lbr_br_type(u64 info) >>>>>>>> { >>>>>>>> - int type = 0; >>>>>>>> + u16 type = 0; >>>>>>>> >>>>>>>> if (static_branch_likely(&x86_lbr_type)) >>>>>>>> type = (info & LBR_INFO_BR_TYPE) >> LBR_INFO_BR_TYPE_OFFSET; >>>>>>>> @@ -904,6 +918,21 @@ static __always_inline int get_lbr_br_type(u64 info) >>>>>>>> return type; >>>>>>>> } >>>>>>>> >>>>>>>> +/* >>>>>>>> + * The kernel cannot expose raw Intel branch type encodings because they are >>>>>>>> + * not generic. Instead, the function below maps the encoding to the >>>>>>>> + * perf_events user visible branch types. >>>>>>>> + */ >>>>>>>> +static __always_inline int get_lbr_br_type_mapping(u64 info) >>>>>>>> +{ >>>>>>>> + if (static_branch_likely(&x86_lbr_type)) { >>>>>>>> + u16 raw_type = get_lbr_br_type(info); >>>>>>>> + if (raw_type < ARCH_LBR_TYPE_COUNT) >>>>>>>> + return arch_lbr_type_map[raw_type]; >>>>>>>> + } >>>>>>>> + return PERF_BR_UNKNOWN; >>>>>>>> +} >>>>>>>> + >>>>>>>> static __always_inline bool get_lbr_mispred(u64 info) >>>>>>>> { >>>>>>>> bool mispred = 0; >>>>>>>> @@ -957,7 +986,7 @@ static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc, >>>>>>>> e->in_tx = !!(info & LBR_INFO_IN_TX); >>>>>>>> e->abort = !!(info & LBR_INFO_ABORT); >>>>>>>> e->cycles = get_lbr_cycles(info); >>>>>>>> - e->type = get_lbr_br_type(info); >>>>>>>> + e->type = get_lbr_br_type_mapping(info); >>>>>>>> } >>>>>>>> >>>>>>>> cpuc->lbr_stack.nr = i; >