Received: by 2002:a05:6a10:6d10:0:0:0:0 with SMTP id gq16csp775354pxb; Fri, 22 Apr 2022 10:53:57 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxDJXL9kw/BCz3usfMsDBl0NeaZh4Chu7a4VbAidr+dynWmlzZ8wjJVMep6b+kiYhS6KUeR X-Received: by 2002:a05:6a00:170d:b0:50a:858e:2b6e with SMTP id h13-20020a056a00170d00b0050a858e2b6emr6160499pfc.48.1650650037699; Fri, 22 Apr 2022 10:53:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1650650037; cv=none; d=google.com; s=arc-20160816; b=s1JDrOBV3gb1+W06sYpJbDIbo3SrCz1k12kHMtS4YEJKabW/7u4Yg2TkFAHL+x2yFY 5R5ENZiuxyAy0c5pWirbYWLE+YOUXHvD8pgG9OVRRZtidIJQTVnqR0/2ubdfaVJah6Dz WjNa+urvH/m/2bZ1B7v4qoF+Kyh7G0vdCgDdVodDoJps5HcHZokdUxDCp4OQBF2plfcj Z50JMa9GLo7SLwfhP9L4EweAy6tlJHPAo5tFmq2EtPXxKX+O926At1971y819bAUrZto QAnrJhjvinU8YQP1v8OtlZHYUEZTRqcoUwq20564jE54fXCP/aoAkaDBKOb7RVqLVXo6 DzCg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id; bh=GDQYyH++QweylOX1ZHJo7Lmbas5HetHFxStbH8WI9Mo=; b=Iez86Ji8Ecs7fwX4F65lW/hKSbTANabWJ6WScMF+FLuIZ5M/yiNy2QpqWCY6Pa3woA ZhcFZ5Lwbvr2l4dpKH5Ft17VBkC4m62HPRKBq2/WWzfnLl8zVjIlKPi6OVaJx9nbbG1c cGXI8SAP9iuPDKhZQ3atMGJU3WGp+iiyc3Xwd1fEgTQ5rKwCTjXT5gi/iIqp92XODOed 9wW6viorYsOlUcYLDY319FnLnaW3qAKcr0GXNn4IH04ObzL7PWJH3mqeUWvaTJzGJTHM 7djQ82ZzvEWWbJRMOwJLh5Zfa28fgWK5URdrwCLnLapryR96gX6v+tZsltwm/w6Lgzc/ CmkQ== ARC-Authentication-Results: i=1; mx.google.com; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [23.128.96.19]) by mx.google.com with ESMTPS id 144-20020a630596000000b003a9216ffe67si8956836pgf.108.2022.04.22.10.53.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Apr 2022 10:53:57 -0700 (PDT) Received-SPF: softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) client-ip=23.128.96.19; Authentication-Results: mx.google.com; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 04689DCAB8; Fri, 22 Apr 2022 10:37:24 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1446472AbiDVKVO (ORCPT + 99 others); Fri, 22 Apr 2022 06:21:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54062 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1353156AbiDVKVN (ORCPT ); Fri, 22 Apr 2022 06:21:13 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id DE1CD30571; Fri, 22 Apr 2022 03:18:20 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A46F81477; Fri, 22 Apr 2022 03:18:20 -0700 (PDT) Received: from [10.57.11.218] (unknown [10.57.11.218]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id ACC4D3F766; Fri, 22 Apr 2022 03:18:18 -0700 (PDT) Message-ID: Date: Fri, 22 Apr 2022 11:18:17 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0 Subject: Re: [PATCH v2 1/6] coresight: Add config flag to enable branch broadcast Content-Language: en-US To: Mike Leach , Suzuki K Poulose Cc: coresight@lists.linaro.org, leo.yan@linaro.com, John Garry , Will Deacon , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org References: <20220113091056.1297982-1-james.clark@arm.com> <20220113091056.1297982-2-james.clark@arm.com> <7fae6630-fdec-3f95-9eac-3f7a5c789272@arm.com> From: James Clark In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-3.0 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,NICE_REPLY_A, RDNS_NONE,SPF_HELO_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 11/03/2022 15:56, Mike Leach wrote: > Hi James, > > On Fri, 11 Mar 2022 at 14:58, James Clark wrote: >> >> >> >> On 02/02/2022 20:25, Mike Leach wrote: >>> Hi James, Suzuki >>> >>> On Fri, 28 Jan 2022 at 11:19, Suzuki K Poulose wrote: >>>> >>>> On 13/01/2022 09:10, James Clark wrote: >>>>> When enabled, all taken branch addresses are output, even if the branch >>>>> was because of a direct branch instruction. This enables reconstruction >>>>> of the program flow without having access to the memory image of the >>>>> code being executed. >>>>> >>>>> Use bit 8 for the config option which would be the correct bit for >>>>> programming ETMv3. Although branch broadcast can't be enabled on ETMv3 >>>>> because it's not in the define ETM3X_SUPPORTED_OPTIONS, using the >>>>> correct bit might help prevent future collisions or allow it to be >>>>> enabled if needed. >>>>> >>>>> Signed-off-by: James Clark >>>>> --- >>>>> drivers/hwtracing/coresight/coresight-etm-perf.c | 2 ++ >>>>> drivers/hwtracing/coresight/coresight-etm4x-core.c | 10 ++++++++++ >>>>> include/linux/coresight-pmu.h | 2 ++ >>>>> 3 files changed, 14 insertions(+) >>>>> >>>>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c >>>>> index c039b6ae206f..43bbd5dc3d3b 100644 >>>>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c >>>>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c >>>>> @@ -52,6 +52,7 @@ static DEFINE_PER_CPU(struct coresight_device *, csdev_src); >>>>> * The PMU formats were orignally for ETMv3.5/PTM's ETMCR 'config'; >>>>> * now take them as general formats and apply on all ETMs. >>>>> */ >>>>> +PMU_FORMAT_ATTR(branch_broadcast, "config:"__stringify(ETM_OPT_BRANCH_BROADCAST)); >>>>> PMU_FORMAT_ATTR(cycacc, "config:" __stringify(ETM_OPT_CYCACC)); >>>>> /* contextid1 enables tracing CONTEXTIDR_EL1 for ETMv4 */ >>>>> PMU_FORMAT_ATTR(contextid1, "config:" __stringify(ETM_OPT_CTXTID)); >>>>> @@ -97,6 +98,7 @@ static struct attribute *etm_config_formats_attr[] = { >>>>> &format_attr_sinkid.attr, >>>>> &format_attr_preset.attr, >>>>> &format_attr_configid.attr, >>>>> + &format_attr_branch_broadcast.attr, >>>> >>>> Does it make sense to hide the option if the bb is not supported ? I >>>> guess it will be tricky as we don't track the common feature set. So, >>>> that said...>>>> >>>>> NULL, >>>>> }; >>>>> >>>>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c >>>>> index bf18128cf5de..04669ecc0efa 100644 >>>>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c >>>>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c >>>>> @@ -692,6 +692,16 @@ static int etm4_parse_event_config(struct coresight_device *csdev, >>>>> ret = cscfg_csdev_enable_active_config(csdev, cfg_hash, preset); >>>>> } >>>>> >>>>> + /* branch broadcast - enable if selected and supported */ >>>>> + if (attr->config & BIT(ETM_OPT_BRANCH_BROADCAST)) { >>>>> + if (!drvdata->trcbb) { >>>>> + ret = -EINVAL; >>>> >>>> Should we fail here ? We could simply ignore this and generate the trace >>>> normally. This would work on a big.LITTLE system with one set missing >>>> the branch broadcast, while the others support. Hi Suzuki, I think if the user really needs branch broadcast because they've modified their binaries then they would want feedback that one of the cores doesn't support it. Otherwise their decode would silently go wrong and they could get stuck. Perf already opens an event per core, so it wouldn't be much effort to manually modify the command line to only select cores where it is supported. I think for that scenario the current patch already works that way because the feature is checked on the active core when the event is enabled? James >>>> >>>> Mike, >>>> >>>> Does this affect the trace decoding ? As such the OpenCSD should be able >>>> to decode the packets as they appear in the stream. Correct ? >>>> >>> >>> Depends on what you mean by affect the trace decoding! >>> From the simplest perspective - no - there is no issue as the packets >>> will be decode as seen. THE decoder does not know that BB exists - nor >>> if it is enabled. >>> >>> However, the reason that a user may engage BB is that the code is in >>> some way self modifying - so that following the code static image and >>> calculating addresses will result in a different path taken. >>> >>> e.g. imagine an opcode:- >>> >>> B >>> >>> Without BB, this will trace as a single E atom, the decoder will >>> calculate address0 from the opcode in the static image and continue >>> from there as the next trace address. >>> >>> Now look at the case where this is changed on the fly to >>> >>> B >>> >>> With BB, This will trace to >>> E >>> TGT_ADDR >>> >>> The decoder will initially extract address0 from the static image, >>> but the immediately following target address packet will alter the >>> next address traced to address1 >>> This is why we have BB. >>> >>> So if the user has a reason to engage BB - we should really fail if >>> it is not present - as the outcome of the trace can be affected. >> >> Hi Mike, >> >> Now I'm starting to wonder if it's best to have some kind of non-binary >> image mode for BB where you'd just get a list of addresses output by >> perf script and it doesn't attempt to do any lookups. > > Not at all sure what you mean by this Hi Mike, I'm not sure if I really understood it either! Now I see what you mean by the decoding issues in Perf and how it doesn't make sense if there is no method to supply modified images. Are you ok with adding this if I add to the new documentation that there is currently no support in Perf and the behavior will be wrong if the binary was modified? But maybe other users of perf_event_open might find it useful? Thanks James > > Mike > >> Although I think >> that would require a change in OpenCSD if it's not aware of the mode? >> >> I also need to go back to my JVM profiling test and see what's >> going on again. But I think I understand your points a bit more now. >> >> Thanks >> James >> >>> >>>> Suzuki >>>> >>>> >>>>> + goto out; >>>>> + } else { >>>>> + config->cfg |= BIT(ETM4_CFG_BIT_BB); >>>>> + } >>>>> + } >>>>> + >>>>> out: >>>>> return ret; >>>>> } >>>>> diff --git a/include/linux/coresight-pmu.h b/include/linux/coresight-pmu.h >>>>> index 4ac5c081af93..6c2fd6cc5a98 100644 >>>>> --- a/include/linux/coresight-pmu.h >>>>> +++ b/include/linux/coresight-pmu.h >>>>> @@ -18,6 +18,7 @@ >>>>> * ETMv3.5/PTM doesn't define ETMCR config bits with prefix "ETM3_" and >>>>> * directly use below macros as config bits. >>>>> */ >>>>> +#define ETM_OPT_BRANCH_BROADCAST 8 >>>>> #define ETM_OPT_CYCACC 12 >>>>> #define ETM_OPT_CTXTID 14 >>>>> #define ETM_OPT_CTXTID2 15 >>>>> @@ -25,6 +26,7 @@ >>>>> #define ETM_OPT_RETSTK 29 >>>>> >>>>> /* ETMv4 CONFIGR programming bits for the ETM OPTs */ >>>>> +#define ETM4_CFG_BIT_BB 3 >>>>> #define ETM4_CFG_BIT_CYCACC 4 >>>>> #define ETM4_CFG_BIT_CTXTID 6 >>>>> #define ETM4_CFG_BIT_VMID 7 >>>> >>> >>> > > >