Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp370006rwb; Mon, 28 Nov 2022 22:47:19 -0800 (PST) X-Google-Smtp-Source: AA0mqf6gdUYoPVDggvA/AUX78qka+rxAdZSrzA88vnUeubm79Hr4zeGaz8oRXJ2IWnTq26lXEZp3 X-Received: by 2002:a17:902:74c3:b0:189:71ff:cfb5 with SMTP id f3-20020a17090274c300b0018971ffcfb5mr16471495plt.7.1669704439009; Mon, 28 Nov 2022 22:47:19 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669704439; cv=none; d=google.com; s=arc-20160816; b=RQYm5OPP39tLxfWoeKpF+cE+FwpnRih9CtL0osEr7lcTOcAAuyZthjnAbc1gMDfztU OGb2eyAUfOSoe+nlz/WorzMdwI3Li9ID0/QmZv8vOqLhnBY30uQUJDtv7gjcjQGhOzrJ sYiT+NJlSWaYY/j+Iud2YMhC6UgPHPpEUsaBSHtDLIpUrSSd4XBz7s9mDYLUEcKkwRsz mqqr1Ce5mLnPa46UKMmIYIW1qxPJPkagZiMRYDOdrStKUkLHTjyaw7U477C8vDdLx7eY lcADFO8tirrOq7R72LOhpf1Kni8IiGxZJV8/5nCiO2503cPMSRf+kL6JTN+anTdomEpf fXVQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:user-agent:mime-version :date:message-id; bh=BwtAw+hbif0bVnF9Wgk1CoIUL4xaS/NtMMgRsZEsEx8=; b=bpL9sFay/+MIgOUtCtSTCL0s34DdpP8QpkYobxPcieo3bXd/WzTcQxNmUsIZcfo31n zmBNOErStpP26Vpj9sm+mFZ+TRRgZ9wRZYfS4V07DvfTCZv2Mg0WupWcvy2hSUjA2DeV 33sSMaEccIKRzjD7G4Sanuh0N5bmrPBucVLnofxNk46vUkhuUv+k4Oq/ZvDWS9snk9HG olbscXeVBbpzv+zDqi36lZuvzdva7AH8fM3Dd2kKw+7DA2ugHWKtZLOuj4SFGBZrny1/ 6H1pj09yV7TgB+pYYY/E8z3X659bxDr4YvO/qsw7O/8xT5tWtFyDahKfIhe7b0PbJre0 hn4Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ip9-20020a17090b314900b00218d473a466si984844pjb.160.2022.11.28.22.47.08; Mon, 28 Nov 2022 22:47:19 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235694AbiK2G1D (ORCPT + 82 others); Tue, 29 Nov 2022 01:27:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51218 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233838AbiK2G1C (ORCPT ); Tue, 29 Nov 2022 01:27:02 -0500 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 012B4532C0; Mon, 28 Nov 2022 22:27:01 -0800 (PST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5613AD6E; Mon, 28 Nov 2022 22:27:07 -0800 (PST) Received: from [10.162.41.67] (unknown [10.162.41.67]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6C7343F67D; Mon, 28 Nov 2022 22:26:56 -0800 (PST) Message-ID: <5e7fdfd1-4b1f-e6ff-0ef2-1be00aad8cf3@arm.com> Date: Tue, 29 Nov 2022 11:56:53 +0530 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.2.2 Subject: Re: [PATCH V5 5/7] arm64/perf: Drive BRBE from perf event states Content-Language: en-US To: Mark Rutland Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, linux-arm-kernel@lists.infradead.org, peterz@infradead.org, acme@kernel.org, will@kernel.org, catalin.marinas@arm.com, Mark Brown , James Clark , Rob Herring , Marc Zyngier , Suzuki Poulose , Ingo Molnar References: <20221107062514.2851047-1-anshuman.khandual@arm.com> <20221107062514.2851047-6-anshuman.khandual@arm.com> From: Anshuman Khandual In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-4.5 required=5.0 tests=BAYES_00,NICE_REPLY_A, RCVD_IN_DNSWL_MED,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 11/18/22 23:45, Mark Rutland wrote: > On Mon, Nov 07, 2022 at 11:55:12AM +0530, Anshuman Khandual wrote: >> Branch stack sampling rides along the normal perf event and all the branch >> records get captured during the PMU interrupt. This just changes perf event >> handling on the arm64 platform to accommodate required BRBE operations that >> will enable branch stack sampling support. >> >> Cc: Peter Zijlstra >> Cc: Ingo Molnar >> Cc: Arnaldo Carvalho de Melo >> Cc: Mark Rutland >> Cc: Will Deacon >> Cc: Catalin Marinas >> Cc: linux-perf-users@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Cc: linux-arm-kernel@lists.infradead.org >> Signed-off-by: Anshuman Khandual >> --- >> arch/arm64/kernel/perf_event.c | 7 ++++++ >> drivers/perf/arm_pmu.c | 40 ++++++++++++++++++++++++++++++++++ >> 2 files changed, 47 insertions(+) >> >> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c >> index c97377e28288..97db333d1208 100644 >> --- a/arch/arm64/kernel/perf_event.c >> +++ b/arch/arm64/kernel/perf_event.c >> @@ -874,6 +874,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu) >> if (!armpmu_event_set_period(event)) >> continue; >> >> + if (has_branch_stack(event)) { >> + cpu_pmu->brbe_read(cpuc, event); >> + data.br_stack = &cpuc->branches->brbe_stack; >> + data.sample_flags |= PERF_SAMPLE_BRANCH_STACK; >> + cpu_pmu->brbe_reset(cpuc); >> + } >> + >> /* >> * Perf event overflow will queue the processing of the event as >> * an irq_work which will be taken care of in the handling of >> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c >> index 5048a500441e..1a8dca4e513e 100644 >> --- a/drivers/perf/arm_pmu.c >> +++ b/drivers/perf/arm_pmu.c >> @@ -271,12 +271,22 @@ armpmu_stop(struct perf_event *event, int flags) >> { >> struct arm_pmu *armpmu = to_arm_pmu(event->pmu); >> struct hw_perf_event *hwc = &event->hw; >> + struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events); >> >> /* >> * ARM pmu always has to update the counter, so ignore >> * PERF_EF_UPDATE, see comments in armpmu_start(). >> */ >> if (!(hwc->state & PERF_HES_STOPPED)) { >> + if (has_branch_stack(event)) { >> + WARN_ON_ONCE(!hw_events->brbe_users); >> + hw_events->brbe_users--; >> + if (!hw_events->brbe_users) { >> + hw_events->brbe_context = NULL; >> + armpmu->brbe_disable(hw_events); >> + } >> + } > > Can't we do the actual enable/disable we start/stop the PMU as a whole? > > If we just counted the numberoof users here we could do the actual > enable/disable in armpmu_{enable,disable}() or armv8pmu_{start,stop}(), like we > do when checking hw_events->used_mask. Right, it can be moved inside armv8pmu_{enable, disable} diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c index 7b0643fe2f13..e64832bc83ba 100644 --- a/arch/arm64/kernel/perf_event.c +++ b/arch/arm64/kernel/perf_event.c @@ -789,6 +789,19 @@ static void armv8pmu_enable_event(struct perf_event *event) * Enable counter */ armv8pmu_enable_event_counter(event); + + if (has_branch_stack(event)) { + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); + struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events); + + if (event->ctx->task && hw_events->brbe_context != event->ctx) { + arm64_pmu_brbe_reset(hw_events); + hw_events->brbe_context = event->ctx; + } + hw_events->brbe_users++; + arm64_pmu_brbe_enable(event); + } } static void armv8pmu_disable_event(struct perf_event *event) @@ -802,6 +815,14 @@ static void armv8pmu_disable_event(struct perf_event *event) * Disable interrupt for this counter */ armv8pmu_disable_event_irq(event); + + if (has_branch_stack(event)) { + struct arm_pmu *armpmu = to_arm_pmu(event->pmu); + struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events); + + if (!hw_events->brbe_users) + arm64_pmu_brbe_disable(event); + } } > > [...] > >> @@ -349,6 +368,10 @@ armpmu_add(struct perf_event *event, int flags) >> hw_events->events[idx] = event; >> >> hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; >> + >> + if (has_branch_stack(event)) >> + armpmu->brbe_filter(hw_events, event); > > What exactly do we need to do here? Since the BRBE is shared, I'm suprised that > there's any pwer-event configuration beyond "yes" or "no". Entire arm64_pmu_brbe_filter() can be moved inside arm64_pmu_brbe_enable(). > >> + >> if (flags & PERF_EF_START) >> armpmu_start(event, PERF_EF_RELOAD); >> >> @@ -443,6 +466,7 @@ __hw_perf_event_init(struct perf_event *event) >> { >> struct arm_pmu *armpmu = to_arm_pmu(event->pmu); >> struct hw_perf_event *hwc = &event->hw; >> + struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events); >> int mapping; >> >> hwc->flags = 0; >> @@ -492,6 +516,9 @@ __hw_perf_event_init(struct perf_event *event) >> local64_set(&hwc->period_left, hwc->sample_period); >> } >> >> + if (has_branch_stack(event)) >> + armpmu->brbe_filter(hw_events, event); > > I do not understand why we would use hw_events here; at this point the event > has only been created, and not even added yet; it doesn't have a counter index. > isn't even being installed into HW. > > What am I missing? perf event's requested branch sampling attributes need to be transferred into the BRBE HW config, when the event gets scheduled on a given CPU. But this transfer happens in two stages. event -----> pmu_hw_events ----> BRBE HW 1. event->attr.branch_sample_type --> pmu_hw_events.[brbcr,brbfcr] (event add) 2. pmu_hw_events.[brbcr,brbfcr] --> BRBXXX_EL1 registers (event enable) But I guess both these stages can be done in event_enable() and the intermediary pmu_hw_events.[brbcr,brbfcr] elements can also be dropped. > >> + >> return validate_group(event); >> } >> >> @@ -520,6 +547,18 @@ static int armpmu_event_init(struct perf_event *event) >> return __hw_perf_event_init(event); >> } >> >> +static void armpmu_sched_task(struct perf_event_context *ctx, bool sched_in) >> +{ >> + struct arm_pmu *armpmu = to_arm_pmu(ctx->pmu); >> + struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events); >> + >> + if (!hw_events->brbe_users) >> + return; >> + >> + if (sched_in) >> + armpmu->brbe_reset(hw_events); > > I see that LBR does a save/restore, whereas IIUC here we discard without even > reading the old values. Is that the intent? Shouldn't we snapshot them into the > task context? Yes, currently that is the intent. We cannot let the branch records from one task slip across into another task corrupting the capture on the later. Only Intel LBR implements such task context save/restore via hw_events, only when LBR call stack feature is supported otherwise, all other sched_task() implementation just flush or invalidate captured branches before changing the context. We could implement such BRBE context store/restore mechanism via given BRBXXXINJ registers/instructions. But this would make our context switch bit expensive and also might not provide substantial benefit as well. Although this can be explored later after the initial enablement.