From: Kan Liang <[email protected]>
Currently, the additional information of a branch entry is stored in a
single u64. As more and more information is added, the space is running
out. For example, the number of occurrences of events will be added for
each branch.
Two places have been suggested for appending the counters:
https://lore.kernel.org/lkml/[email protected]/
One place is right after the flags of each branch entry. It changes the
existing struct perf_branch_entry, and later arch-specific
implementations would have to be very careful to consistently pick the
right struct.
The other place is right after the entire struct perf_branch_stack. The
disadvantage is that the pointer to the extra space has to be recorded,
and the common interface perf_sample_save_brstack() has to be updated.
The latter is more straightforward and easier to understand and
maintain. It is what this patch implements.
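For reference, the resulting sample layout (also documented in the uapi
header below) appends the counters right after the branch entries:

    { u64 nr;
      { u64 hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
      { u64 from, to, flags } lbr[nr];
      { u64 counters; } cntr[nr] && PERF_SAMPLE_BRANCH_COUNTERS
    } && PERF_SAMPLE_BRANCH_STACK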
Add a new branch sample type, PERF_SAMPLE_BRANCH_COUNTERS, to indicate
an event whose occurrences are recorded in the branch info.
The "u64 counters" may store the occurrences of several events. The
information regarding the number of events/counters and the width of
each counter should be exposed via sysfs as a reference for the perf
tool. Define the branch_counter_nr and branch_counter_width ABI here.
The support will be implemented later in the Intel-specific patch.
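As an illustration only (not part of this patch), a perf-tool-side
consumer could decode one logged u64 with the two sysfs values. The
packing (LSB-first, "width" bits per counter) is an assumption here,
matching the layout used by the later Intel-specific patch:

    #include <stdio.h>
    #include <stdint.h>

    /* Illustrative sketch: decode one "u64 counters" value using the
     * branch_counter_nr (nr) and branch_counter_width (width) caps. */
    static void decode_branch_counters(uint64_t counters, unsigned int nr,
                                       unsigned int width)
    {
            uint64_t mask = (1ULL << width) - 1;
            unsigned int i;

            for (i = 0; i < nr; i++)
                    printf("event %u: %llu occurrences\n", i,
                           (unsigned long long)((counters >> (i * width)) & mask));
    }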
Suggested-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Athira Rajeev <[email protected]>
---
No changes since V4
.../testing/sysfs-bus-event_source-devices-caps | 6 ++++++
arch/powerpc/perf/core-book3s.c | 2 +-
arch/x86/events/amd/core.c | 2 +-
arch/x86/events/core.c | 2 +-
arch/x86/events/intel/core.c | 2 +-
arch/x86/events/intel/ds.c | 4 ++--
include/linux/perf_event.h | 17 ++++++++++++++++-
include/uapi/linux/perf_event.h | 10 ++++++++++
kernel/events/core.c | 8 ++++++++
9 files changed, 46 insertions(+), 7 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-caps b/Documentation/ABI/testing/sysfs-bus-event_source-devices-caps
index 8757dcf41c08..a5f506f7d481 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-caps
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-caps
@@ -16,3 +16,9 @@ Description:
Example output in powerpc:
grep . /sys/bus/event_source/devices/cpu/caps/*
/sys/bus/event_source/devices/cpu/caps/pmu_name:POWER9
+
+ The "branch_counter_nr" in the supported platform exposes the
+ maximum number of counters which can be shown in the u64 counters
+ of PERF_SAMPLE_BRANCH_COUNTERS, while the "branch_counter_width"
+ exposes the width of each counter. Both of them can be used by
+ the perf tool to parse the logged counters in each branch.
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 8c1f7def596e..3c14596bbfaf 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2313,7 +2313,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
struct cpu_hw_events *cpuhw;
cpuhw = this_cpu_ptr(&cpu_hw_events);
power_pmu_bhrb_read(event, cpuhw);
- perf_sample_save_brstack(&data, event, &cpuhw->bhrb_stack);
+ perf_sample_save_brstack(&data, event, &cpuhw->bhrb_stack, NULL);
}
if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index e24976593a29..4ee6390b45c9 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -940,7 +940,7 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
continue;
if (has_branch_stack(event))
- perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);
+ perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 40ad1425ffa2..40c9af124128 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1702,7 +1702,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
perf_sample_data_init(&data, 0, event->hw.last_period);
if (has_branch_stack(event))
- perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);
+ perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index a08f794a0e79..41a164764a84 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3047,7 +3047,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
perf_sample_data_init(&data, 0, event->hw.last_period);
if (has_branch_stack(event))
- perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);
+ perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index dc8204afec76..1d309e97a629 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1755,7 +1755,7 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
setup_pebs_time(event, data, pebs->tsc);
if (has_branch_stack(event))
- perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
}
static void adaptive_pebs_save_regs(struct pt_regs *regs,
@@ -1912,7 +1912,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
if (has_branch_stack(event)) {
intel_pmu_store_pebs_lbrs(lbr);
- perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
}
}
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0367d748fae0..7897ef066027 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1139,6 +1139,10 @@ static inline bool branch_sample_priv(const struct perf_event *event)
return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_PRIV_SAVE;
}
+static inline bool branch_sample_counters(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS;
+}
struct perf_sample_data {
/*
@@ -1173,6 +1177,7 @@ struct perf_sample_data {
struct perf_callchain_entry *callchain;
struct perf_raw_record *raw;
struct perf_branch_stack *br_stack;
+ u64 *br_stack_cntr;
union perf_sample_weight weight;
union perf_mem_data_src data_src;
u64 txn;
@@ -1250,7 +1255,8 @@ static inline void perf_sample_save_raw_data(struct perf_sample_data *data,
static inline void perf_sample_save_brstack(struct perf_sample_data *data,
struct perf_event *event,
- struct perf_branch_stack *brs)
+ struct perf_branch_stack *brs,
+ u64 *brs_cntr)
{
int size = sizeof(u64); /* nr */
@@ -1258,7 +1264,16 @@ static inline void perf_sample_save_brstack(struct perf_sample_data *data,
size += sizeof(u64);
size += brs->nr * sizeof(struct perf_branch_entry);
+ /*
+ * The extension space for counters is appended after the
+ * struct perf_branch_stack. It is used to store the occurrences
+	 * of events on each branch.
+ */
+ if (brs_cntr)
+ size += brs->nr * sizeof(u64);
+
data->br_stack = brs;
+ data->br_stack_cntr = brs_cntr;
data->dyn_size += size;
data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
}
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 39c6a250dd1b..4461f380425b 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -204,6 +204,8 @@ enum perf_branch_sample_type_shift {
PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT = 18, /* save privilege mode */
+ PERF_SAMPLE_BRANCH_COUNTERS_SHIFT = 19, /* save occurrences of events on a branch */
+
PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */
};
@@ -235,6 +237,8 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_PRIV_SAVE = 1U << PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT,
+ PERF_SAMPLE_BRANCH_COUNTERS = 1U << PERF_SAMPLE_BRANCH_COUNTERS_SHIFT,
+
PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
};
@@ -982,6 +986,12 @@ enum perf_event_type {
* { u64 nr;
* { u64 hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
* { u64 from, to, flags } lbr[nr];
+ * #
+ * # The format of the counters is decided by the
+ * # "branch_counter_nr" and "branch_counter_width",
+ * # which are defined in the ABI.
+ * #
+ * { u64 counters; } cntr[nr] && PERF_SAMPLE_BRANCH_COUNTERS
* } && PERF_SAMPLE_BRANCH_STACK
*
* { u64 abi; # enum perf_sample_regs_abi
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3eb26c2c6e65..d27ffd80ed67 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7341,6 +7341,14 @@ void perf_output_sample(struct perf_output_handle *handle,
if (branch_sample_hw_index(event))
perf_output_put(handle, data->br_stack->hw_idx);
perf_output_copy(handle, data->br_stack->entries, size);
+ /*
+ * Add the extension space which is appended
+ * right after the struct perf_branch_stack.
+ */
+ if (data->br_stack_cntr) {
+ size = data->br_stack->nr * sizeof(u64);
+ perf_output_copy(handle, data->br_stack_cntr, size);
+ }
} else {
/*
* we always store at least the value of nr
--
2.35.1
From: Kan Liang <[email protected]>
Currently, branch_sample_type != 0 is used to check whether a branch
stack setup is required. But it doesn't check the sample type, so an
unnecessary branch stack setup may be done for a counting event. E.g.,
perf record -e "{branch-instructions,branch-misses}:S" -j any
Also, an event with only the new PERF_SAMPLE_BRANCH_COUNTERS branch
sample type may not require a branch stack setup either.
Add a new flag, NEEDS_BRANCH_STACK, to indicate whether the event
requires a branch stack setup. Replace needs_branch_stack() with a
check of the new flag.
The counting-event check is implemented here. A later patch will take
the new PERF_SAMPLE_BRANCH_COUNTERS into account.
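For context, the generic helpers involved look roughly like this
(simplified from include/linux/perf_event.h as of this series):

    static inline bool needs_branch_stack(struct perf_event *event)
    {
            return event->attr.branch_sample_type != 0;
    }

    static inline bool is_sampling_event(struct perf_event *event)
    {
            return event->attr.sample_period != 0;
    }

A counting event has sample_period == 0, so the new
"needs_branch_stack(event) && is_sampling_event(event)" gate skips the
branch stack setup for it.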
Signed-off-by: Kan Liang <[email protected]>
---
No changes since V4
arch/x86/events/intel/core.c | 14 +++++++++++---
arch/x86/events/perf_event_flags.h | 1 +
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 41a164764a84..a99449c0d77c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2527,9 +2527,14 @@ static void intel_pmu_assign_event(struct perf_event *event, int idx)
perf_report_aux_output_id(event, idx);
}
+static __always_inline bool intel_pmu_needs_branch_stack(struct perf_event *event)
+{
+ return event->hw.flags & PERF_X86_EVENT_NEEDS_BRANCH_STACK;
+}
+
static void intel_pmu_del_event(struct perf_event *event)
{
- if (needs_branch_stack(event))
+ if (intel_pmu_needs_branch_stack(event))
intel_pmu_lbr_del(event);
if (event->attr.precise_ip)
intel_pmu_pebs_del(event);
@@ -2820,7 +2825,7 @@ static void intel_pmu_add_event(struct perf_event *event)
{
if (event->attr.precise_ip)
intel_pmu_pebs_add(event);
- if (needs_branch_stack(event))
+ if (intel_pmu_needs_branch_stack(event))
intel_pmu_lbr_add(event);
}
@@ -3897,7 +3902,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
x86_pmu.pebs_aliases(event);
}
- if (needs_branch_stack(event)) {
+ if (needs_branch_stack(event) && is_sampling_event(event))
+ event->hw.flags |= PERF_X86_EVENT_NEEDS_BRANCH_STACK;
+
+ if (intel_pmu_needs_branch_stack(event)) {
ret = intel_pmu_setup_lbr_filter(event);
if (ret)
return ret;
diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
index 1dc19b9b4426..a1685981c520 100644
--- a/arch/x86/events/perf_event_flags.h
+++ b/arch/x86/events/perf_event_flags.h
@@ -20,3 +20,4 @@ PERF_ARCH(TOPDOWN, 0x04000) /* Count Topdown slots/metrics events */
PERF_ARCH(PEBS_STLAT, 0x08000) /* st+stlat data address sampling */
PERF_ARCH(AMD_BRS, 0x10000) /* AMD Branch Sampling */
PERF_ARCH(PEBS_LAT_HYBRID, 0x20000) /* ld and st lat for hybrid */
+PERF_ARCH(NEEDS_BRANCH_STACK, 0x40000) /* require branch stack setup */
--
2.35.1
From: Kan Liang <[email protected]>
Add a helper function to check the call stack sample type.
A later patch will invoke the function in several places.
Signed-off-by: Kan Liang <[email protected]>
---
No changes since V4
arch/x86/events/core.c | 2 +-
include/linux/perf_event.h | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 40c9af124128..09050641ce5d 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -601,7 +601,7 @@ int x86_pmu_hw_config(struct perf_event *event)
}
}
- if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_CALL_STACK)
+ if (branch_sample_call_stack(event))
event->attach_state |= PERF_ATTACH_TASK_DATA;
/*
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 7897ef066027..ac1a59c1f252 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1144,6 +1144,11 @@ static inline bool branch_sample_counters(const struct perf_event *event)
return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS;
}
+static inline bool branch_sample_call_stack(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_CALL_STACK;
+}
+
struct perf_sample_data {
/*
* Fields set by perf_sample_data_init() unconditionally,
--
2.35.1
From: Kan Liang <[email protected]>
Some attrs and is_visible implementations are rather far away from one
another, which makes the whole thing hard to interpret.
There are only two attribute groups which have both .attrs and
.is_visible: group_default and group_caps_lbr. Move them together.
No functional changes.
Suggested-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
New patch
arch/x86/events/intel/core.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index a99449c0d77c..584b58df7bf6 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5540,6 +5540,12 @@ static struct attribute *lbr_attrs[] = {
NULL
};
+static umode_t
+lbr_is_visible(struct kobject *kobj, struct attribute *attr, int i)
+{
+ return x86_pmu.lbr_nr ? attr->mode : 0;
+}
+
static char pmu_name_str[30];
static ssize_t pmu_name_show(struct device *cdev,
@@ -5566,6 +5572,15 @@ static struct attribute *intel_pmu_attrs[] = {
NULL,
};
+static umode_t
+default_is_visible(struct kobject *kobj, struct attribute *attr, int i)
+{
+ if (attr == &dev_attr_allow_tsx_force_abort.attr)
+ return x86_pmu.flags & PMU_FL_TFA ? attr->mode : 0;
+
+ return attr->mode;
+}
+
static umode_t
tsx_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
@@ -5587,27 +5602,12 @@ mem_is_visible(struct kobject *kobj, struct attribute *attr, int i)
return pebs_is_visible(kobj, attr, i);
}
-static umode_t
-lbr_is_visible(struct kobject *kobj, struct attribute *attr, int i)
-{
- return x86_pmu.lbr_nr ? attr->mode : 0;
-}
-
static umode_t
exra_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
return x86_pmu.version >= 2 ? attr->mode : 0;
}
-static umode_t
-default_is_visible(struct kobject *kobj, struct attribute *attr, int i)
-{
- if (attr == &dev_attr_allow_tsx_force_abort.attr)
- return x86_pmu.flags & PMU_FL_TFA ? attr->mode : 0;
-
- return attr->mode;
-}
-
static struct attribute_group group_events_td = {
.name = "events",
};
--
2.35.1
From: Kan Liang <[email protected]>
Sync the new sample type for the branch counters feature, and add
PERF_BRANCH_ENTRY_INFO_BITS_MAX (33, the sum of the used bits in the
perf_branch_entry flags word: 1+1+1+1+16+4+2+4+3).
Signed-off-by: Kan Liang <[email protected]>
---
Changes since V4
- Add PERF_BRANCH_ENTRY_INFO_BITS_MAX
tools/include/uapi/linux/perf_event.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 39c6a250dd1b..3a64499b0f5d 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -204,6 +204,8 @@ enum perf_branch_sample_type_shift {
PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT = 18, /* save privilege mode */
+ PERF_SAMPLE_BRANCH_COUNTERS_SHIFT = 19, /* save occurrences of events on a branch */
+
PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */
};
@@ -235,6 +237,8 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_PRIV_SAVE = 1U << PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT,
+ PERF_SAMPLE_BRANCH_COUNTERS = 1U << PERF_SAMPLE_BRANCH_COUNTERS_SHIFT,
+
PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
};
@@ -982,6 +986,12 @@ enum perf_event_type {
* { u64 nr;
* { u64 hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
* { u64 from, to, flags } lbr[nr];
+ * #
+ * # The format of the counters is decided by the
+ * # "branch_counter_nr" and "branch_counter_width",
+ * # which are defined in the ABI.
+ * #
+ * { u64 counters; } cntr[nr] && PERF_SAMPLE_BRANCH_COUNTERS
* } && PERF_SAMPLE_BRANCH_STACK
*
* { u64 abi; # enum perf_sample_regs_abi
@@ -1427,6 +1437,9 @@ struct perf_branch_entry {
reserved:31;
};
+/* Size of used info bits in struct perf_branch_entry */
+#define PERF_BRANCH_ENTRY_INFO_BITS_MAX 33
+
union perf_sample_weight {
__u64 full;
#if defined(__LITTLE_ENDIAN_BITFIELD)
--
2.35.1
From: Kan Liang <[email protected]>
To support the branch counters feature, the maximum number of supported
counters and the width of each counter are exposed in the sysfs caps
folder. The perf tool can use the information to parse the logged
counters in each branch.
Store the information in the perf_env for later usage.
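On a supporting kernel and CPU, the values can be read like the existing
caps example in the ABI document. The output below is illustrative; the
Intel-specific patch later in this series exposes a width of 2 and up to
4 counters on SRF/GRR:

    grep . /sys/bus/event_source/devices/cpu/caps/branch_counter_*
    /sys/bus/event_source/devices/cpu/caps/branch_counter_nr:4
    /sys/bus/event_source/devices/cpu/caps/branch_counter_width:2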
Signed-off-by: Kan Liang <[email protected]>
---
No changes since V4
tools/perf/util/env.h | 5 +++++
tools/perf/util/header.c | 18 +++++++++++++++---
2 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 4566c51f2fd9..48d7f8759a2a 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -46,6 +46,9 @@ struct hybrid_node {
struct pmu_caps {
int nr_caps;
unsigned int max_branches;
+ unsigned int br_cntr_nr;
+ unsigned int br_cntr_width;
+
char **caps;
char *pmu_name;
};
@@ -62,6 +65,8 @@ struct perf_env {
unsigned long long total_mem;
unsigned int msr_pmu_type;
unsigned int max_branches;
+ unsigned int br_cntr_nr;
+ unsigned int br_cntr_width;
int kernel_is_64_bit;
int nr_cmdline;
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index d812e1e371a7..9664062ba835 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -3256,7 +3256,9 @@ static int process_compressed(struct feat_fd *ff,
}
static int __process_pmu_caps(struct feat_fd *ff, int *nr_caps,
- char ***caps, unsigned int *max_branches)
+ char ***caps, unsigned int *max_branches,
+ unsigned int *br_cntr_nr,
+ unsigned int *br_cntr_width)
{
char *name, *value, *ptr;
u32 nr_pmu_caps, i;
@@ -3291,6 +3293,12 @@ static int __process_pmu_caps(struct feat_fd *ff, int *nr_caps,
if (!strcmp(name, "branches"))
*max_branches = atoi(value);
+ if (!strcmp(name, "branch_counter_nr"))
+ *br_cntr_nr = atoi(value);
+
+ if (!strcmp(name, "branch_counter_width"))
+ *br_cntr_width = atoi(value);
+
free(value);
free(name);
}
@@ -3315,7 +3323,9 @@ static int process_cpu_pmu_caps(struct feat_fd *ff,
{
int ret = __process_pmu_caps(ff, &ff->ph->env.nr_cpu_pmu_caps,
&ff->ph->env.cpu_pmu_caps,
- &ff->ph->env.max_branches);
+ &ff->ph->env.max_branches,
+ &ff->ph->env.br_cntr_nr,
+ &ff->ph->env.br_cntr_width);
if (!ret && !ff->ph->env.cpu_pmu_caps)
pr_debug("cpu pmu capabilities not available\n");
@@ -3344,7 +3354,9 @@ static int process_pmu_caps(struct feat_fd *ff, void *data __maybe_unused)
for (i = 0; i < nr_pmu; i++) {
ret = __process_pmu_caps(ff, &pmu_caps[i].nr_caps,
&pmu_caps[i].caps,
- &pmu_caps[i].max_branches);
+ &pmu_caps[i].max_branches,
+ &pmu_caps[i].br_cntr_nr,
+ &pmu_caps[i].br_cntr_width);
if (ret)
goto err;
--
2.35.1
From: Kan Liang <[email protected]>
Add a new branch filter, "counter", for the branch counter option. It is
used to mark the events which should be logged in the branch. If it is
applied with the -j option, the counters of all the events should be
logged in the branch. If the legacy kernel doesn't support the new
branch sample type, switching off the branch counter filter.
The stored counter values in each branch are displayed right after the
regular branch stack information via perf report -D.
Usage examples:
perf record -e "{branch-instructions,branch-misses}:S" -j any,counter
Only the first event, branch-instructions, collects the LBR. Both
branch-instructions and branch-misses are marked as logged events.
Their occurrence counts can be found in the branch stack extension
space of each branch.
perf record -e "{cpu/branch-instructions,branch_type=any/,
cpu/branch-misses,branch_type=counter/}"
Only the first event, branch-instructions, collects the LBR. Only the
branch-misses event is marked as a logged event.
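Based on the print format added to session.c below, the dump looks
roughly as follows (all values are illustrative):

    ... branch stack counters: nr:8 (counter width: 2 max counter nr:4)
    .....  0: 0000000000000002
    .....  1: 0000000000000005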
Signed-off-by: Kan Liang <[email protected]>
---
No changes since V4
tools/perf/Documentation/perf-record.txt | 4 +++
tools/perf/util/evsel.c | 31 ++++++++++++++++++++++-
tools/perf/util/evsel.h | 1 +
tools/perf/util/parse-branch-options.c | 1 +
tools/perf/util/perf_event_attr_fprintf.c | 1 +
tools/perf/util/sample.h | 1 +
tools/perf/util/session.c | 15 +++++++++--
7 files changed, 51 insertions(+), 3 deletions(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index d5217be012d7..b6afe7cc948d 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -442,6 +442,10 @@ following filters are defined:
4th-Gen Xeon+ server), the save branch type is unconditionally enabled
when the taken branch stack sampling is enabled.
- priv: save privilege state during sampling in case binary is not available later
+ - counter: save occurrences of the event since the last branch entry. Currently, the
+            feature is only supported on newer CPUs, e.g., Intel Sierra Forest and
+            later platforms. An error is expected if it's used on an unsupported
+            kernel or CPU.
+
The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a8a5ff87cc1f..58a9b8c82790 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1831,6 +1831,8 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
static void evsel__disable_missing_features(struct evsel *evsel)
{
+ if (perf_missing_features.branch_counters)
+ evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_COUNTERS;
if (perf_missing_features.read_lost)
evsel->core.attr.read_format &= ~PERF_FORMAT_LOST;
if (perf_missing_features.weight_struct) {
@@ -1884,7 +1886,12 @@ bool evsel__detect_missing_features(struct evsel *evsel)
* Must probe features in the order they were added to the
* perf_event_attr interface.
*/
- if (!perf_missing_features.read_lost &&
+ if (!perf_missing_features.branch_counters &&
+ (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS)) {
+ perf_missing_features.branch_counters = true;
+ pr_debug2("switching off branch counters support\n");
+ return true;
+ } else if (!perf_missing_features.read_lost &&
(evsel->core.attr.read_format & PERF_FORMAT_LOST)) {
perf_missing_features.read_lost = true;
pr_debug2("switching off PERF_FORMAT_LOST support\n");
@@ -2344,6 +2351,18 @@ u64 evsel__bitfield_swap_branch_flags(u64 value)
return new_val;
}
+static inline bool evsel__has_branch_counters(const struct evsel *evsel)
+{
+ struct evsel *cur, *leader = evsel__leader(evsel);
+
+ evlist__for_each_entry(evsel->evlist, cur) {
+ if ((leader == evsel__leader(cur)) &&
+ (cur->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS))
+ return true;
+ }
+ return false;
+}
+
int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
struct perf_sample *data)
{
@@ -2577,6 +2596,16 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
OVERFLOW_CHECK(array, sz, max_size);
array = (void *)array + sz;
+
+ if (evsel__has_branch_counters(evsel)) {
+ OVERFLOW_CHECK_u64(array);
+
+ data->branch_stack_cntr = (u64 *)array;
+ sz = data->branch_stack->nr * sizeof(u64);
+
+ OVERFLOW_CHECK(array, sz, max_size);
+ array = (void *)array + sz;
+ }
}
if (type & PERF_SAMPLE_REGS_USER) {
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 848534ec74fa..85f24c986392 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -191,6 +191,7 @@ struct perf_missing_features {
bool code_page_size;
bool weight_struct;
bool read_lost;
+ bool branch_counters;
};
extern struct perf_missing_features perf_missing_features;
diff --git a/tools/perf/util/parse-branch-options.c b/tools/perf/util/parse-branch-options.c
index fd67d204d720..f7f7aff3d85a 100644
--- a/tools/perf/util/parse-branch-options.c
+++ b/tools/perf/util/parse-branch-options.c
@@ -36,6 +36,7 @@ static const struct branch_mode branch_modes[] = {
BRANCH_OPT("stack", PERF_SAMPLE_BRANCH_CALL_STACK),
BRANCH_OPT("hw_index", PERF_SAMPLE_BRANCH_HW_INDEX),
BRANCH_OPT("priv", PERF_SAMPLE_BRANCH_PRIV_SAVE),
+ BRANCH_OPT("counter", PERF_SAMPLE_BRANCH_COUNTERS),
BRANCH_END
};
diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
index 2247991451f3..8f04d3b7f3ec 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -55,6 +55,7 @@ static void __p_branch_sample_type(char *buf, size_t size, u64 value)
bit_name(COND), bit_name(CALL_STACK), bit_name(IND_JUMP),
bit_name(CALL), bit_name(NO_FLAGS), bit_name(NO_CYCLES),
bit_name(TYPE_SAVE), bit_name(HW_INDEX), bit_name(PRIV_SAVE),
+ bit_name(COUNTERS),
{ .name = NULL, }
};
#undef bit_name
diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
index c92ad0f51ecd..70b2c3135555 100644
--- a/tools/perf/util/sample.h
+++ b/tools/perf/util/sample.h
@@ -113,6 +113,7 @@ struct perf_sample {
void *raw_data;
struct ip_callchain *callchain;
struct branch_stack *branch_stack;
+ u64 *branch_stack_cntr;
struct regs_dump user_regs;
struct regs_dump intr_regs;
struct stack_dump user_stack;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 1e9aa8ed15b6..4a094ab0362b 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1150,9 +1150,13 @@ static void callchain__printf(struct evsel *evsel,
i, callchain->ips[i]);
}
-static void branch_stack__printf(struct perf_sample *sample, bool callstack)
+static void branch_stack__printf(struct perf_sample *sample,
+ struct evsel *evsel)
{
struct branch_entry *entries = perf_sample__branch_entries(sample);
+ bool callstack = evsel__has_branch_callstack(evsel);
+ u64 *branch_stack_cntr = sample->branch_stack_cntr;
+ struct perf_env *env = evsel__env(evsel);
uint64_t i;
if (!callstack) {
@@ -1194,6 +1198,13 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
}
}
}
+
+ if (branch_stack_cntr) {
+ printf("... branch stack counters: nr:%" PRIu64 " (counter width: %u max counter nr:%u)\n",
+ sample->branch_stack->nr, env->br_cntr_width, env->br_cntr_nr);
+ for (i = 0; i < sample->branch_stack->nr; i++)
+ printf("..... %2"PRIu64": %016" PRIx64 "\n", i, branch_stack_cntr[i]);
+ }
}
static void regs_dump__printf(u64 mask, u64 *regs, const char *arch)
@@ -1355,7 +1366,7 @@ static void dump_sample(struct evsel *evsel, union perf_event *event,
callchain__printf(evsel, sample);
if (evsel__has_br_stack(evsel))
- branch_stack__printf(sample, evsel__has_branch_callstack(evsel));
+ branch_stack__printf(sample, evsel);
if (sample_type & PERF_SAMPLE_REGS_USER)
regs_user__printf(sample, arch);
--
2.35.1
From: Kan Liang <[email protected]>
The branch counters logging (a.k.a. LBR event logging) introduces a
per-counter indication of precise event occurrences in LBRs. It can
provide a means to attribute exposed retirement latency to combinations
of events across a block of instructions. It also provides a means of
attributing Timed LBR latencies to events.
The feature is first introduced on SRF/GRR. It is an enhancement of
ARCH LBR. It adds new fields in the LBR_INFO MSRs to log the occurrences
of events on the GP counters. The information is logged in counter
order.
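For orientation, the new fields occupy bits 39:32 of each LBR_INFO value
(see the msr-index.h hunk below): 4 counters x 2 bits. A minimal,
illustrative extraction sketch:

    /* Illustrative only: pull the per-GP-counter fields out of one
     * LBR_INFO value (4 counters x 2 bits, starting at bit 32). */
    u64 cntrs = (info >> 32) & 0xffULL; /* all 8 counter bits */
    u64 cnt0  = cntrs & 0x3;            /* occurrences on GP counter 0 */
    u64 cnt3  = (cntrs >> 6) & 0x3;     /* occurrences on GP counter 3 */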
The design proposed in this patch requires that the logged events be in
the same group as the event that samples the LBR. If there is more than
one LBR group, only the counter information from the current
(overflowed) group is stored for the perf tool; otherwise, the perf
tool cannot know which other groups were scheduled, and when,
especially once multiplexing is triggered. The user can use the maximum
number of counters that support LBR info (currently 4) by making the
group large enough.
The HW logs events only in counter order, which may differ from the
enabling order that the perf tool understands. When parsing the
information of each branch entry, convert the counter order to the
enabled order, and store the enabled order in the extension space.
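A worked example of the conversion (values are illustrative): suppose
the group enables two logged events, the leader on GP counter 2 and a
sibling on GP counter 0, so the enabled order is {2, 0}. If a branch
logged 3 occurrences on counter 2 and 1 on counter 0, the raw
counter-order field is 0x31 (counter 0 in bits 1:0, counter 2 in bits
5:4), and the reordered value stored for the perf tool is 0x7 (the
leader's 3 in bits 1:0, the sibling's 1 in bits 3:2).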
Unconditionally reset LBRs for an LBR event group when it's deleted. The
logged counter information is only valid for the current LBR group. If
another LBR group is scheduled later, the information from the stale
LBRs would otherwise be wrongly interpreted.
Add a sanity check in intel_pmu_hw_config(). Disable the feature if other
counter filters (inv, cmask, edge, in_tx) are set or LBR call stack mode
is enabled. (For the LBR call stack mode, we cannot simply flush the
LBR, since it will break the call stack. Also, there is no obvious usage
with the call stack mode for now.)
Applying only PERF_SAMPLE_BRANCH_COUNTERS doesn't require any branch
stack setup.
Expose the maximum number of supported counters and the width of the
counters into the sysfs. The perf tool can use the information to parse
the logged counters in each branch.
Signed-off-by: Kan Liang <[email protected]>
---
Changes since V4:
- Only reset LBR once for a branch counter group when the group is
deleted. Force the leader event of a branch counter group to be an LBR event.
- Use the name "branch counters" to replace "event logging"
- Use the suggested code from PeterZ for branch counter reordering
- Check the max number for a branch counter group in hw_config
- Use the reserved field to store the temporary branch counters
information
arch/x86/events/intel/core.c | 103 +++++++++++++++++++++++++++--
arch/x86/events/intel/ds.c | 2 +-
arch/x86/events/intel/lbr.c | 85 +++++++++++++++++++++++-
arch/x86/events/perf_event.h | 12 ++++
arch/x86/events/perf_event_flags.h | 1 +
arch/x86/include/asm/msr-index.h | 5 ++
arch/x86/include/asm/perf_event.h | 4 ++
include/uapi/linux/perf_event.h | 3 +
8 files changed, 207 insertions(+), 8 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 584b58df7bf6..3a249b026d5c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2792,6 +2792,7 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
static void intel_pmu_enable_event(struct perf_event *event)
{
+ u64 enable_mask = ARCH_PERFMON_EVENTSEL_ENABLE;
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
@@ -2800,8 +2801,10 @@ static void intel_pmu_enable_event(struct perf_event *event)
switch (idx) {
case 0 ... INTEL_PMC_IDX_FIXED - 1:
+ if (branch_sample_counters(event))
+ enable_mask |= ARCH_PERFMON_EVENTSEL_BR_CNTR;
intel_set_masks(event, idx);
- __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
+ __x86_pmu_enable_event(hwc, enable_mask);
break;
case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
@@ -3052,7 +3055,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
perf_sample_data_init(&data, 0, event->hw.last_period);
if (has_branch_stack(event))
- perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
+ intel_pmu_lbr_save_brstack(&data, cpuc, event);
if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
@@ -3617,6 +3620,13 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
if (cpuc->excl_cntrs)
return intel_get_excl_constraints(cpuc, event, idx, c2);
+ /* Not all counters support the branch counter feature. */
+ if (branch_sample_counters(event)) {
+ c2 = dyn_constraint(cpuc, c2, idx);
+ c2->idxmsk64 &= x86_pmu.lbr_counters;
+ c2->weight = hweight64(c2->idxmsk64);
+ }
+
return c2;
}
@@ -3905,6 +3915,58 @@ static int intel_pmu_hw_config(struct perf_event *event)
if (needs_branch_stack(event) && is_sampling_event(event))
event->hw.flags |= PERF_X86_EVENT_NEEDS_BRANCH_STACK;
+ if (branch_sample_counters(event)) {
+ struct perf_event *leader, *sibling;
+ int num = 0;
+
+ if (!(x86_pmu.flags & PMU_FL_BR_CNTR) ||
+ (event->attr.config & ~INTEL_ARCH_EVENT_MASK))
+ return -EINVAL;
+
+ /*
+ * The branch counter logging is not supported in the call stack
+ * mode yet, since we cannot simply flush the LBR during e.g.,
+ * multiplexing. Also, there is no obvious usage with the call
+	 * stack mode. Simply forbid it for now.
+ *
+ * If any events in the group enable the branch counter logging
+ * feature, the group is treated as a branch counter logging
+ * group, which requires the extra space to store the counters.
+ */
+ leader = event->group_leader;
+ if (branch_sample_call_stack(leader))
+ return -EINVAL;
+ if (branch_sample_counters(leader))
+ num++;
+ leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
+
+ for_each_sibling_event(sibling, leader) {
+ if (branch_sample_call_stack(sibling))
+ return -EINVAL;
+ if (branch_sample_counters(sibling))
+ num++;
+ }
+
+ if (num > fls(x86_pmu.lbr_counters))
+ return -EINVAL;
+ /*
+	 * Applying only PERF_SAMPLE_BRANCH_COUNTERS doesn't require
+	 * any branch stack setup.
+ * Clear the bit to avoid unnecessary branch stack setup.
+ */
+ if (0 == (event->attr.branch_sample_type &
+ ~(PERF_SAMPLE_BRANCH_PLM_ALL |
+ PERF_SAMPLE_BRANCH_COUNTERS)))
+ event->hw.flags &= ~PERF_X86_EVENT_NEEDS_BRANCH_STACK;
+
+ /*
+	 * Force the leader to be an LBR event, so LBRs can be reset
+ * with the leader event. See intel_pmu_lbr_del() for details.
+ */
+ if (!intel_pmu_needs_branch_stack(leader))
+ return -EINVAL;
+ }
+
if (intel_pmu_needs_branch_stack(event)) {
ret = intel_pmu_setup_lbr_filter(event);
if (ret)
@@ -4383,8 +4445,13 @@ cmt_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
*/
if (event->attr.precise_ip == 3) {
/* Force instruction:ppp on PMC0, 1 and Fixed counter 0 */
- if (constraint_match(&fixed0_constraint, event->hw.config))
- return &fixed0_counter0_1_constraint;
+ if (constraint_match(&fixed0_constraint, event->hw.config)) {
+ /* The fixed counter 0 doesn't support LBR event logging. */
+ if (branch_sample_counters(event))
+ return &counter0_1_constraint;
+ else
+ return &fixed0_counter0_1_constraint;
+ }
switch (c->idxmsk64 & 0x3ull) {
case 0x1:
@@ -4563,7 +4630,7 @@ int intel_cpuc_prepare(struct cpu_hw_events *cpuc, int cpu)
goto err;
}
- if (x86_pmu.flags & (PMU_FL_EXCL_CNTRS | PMU_FL_TFA)) {
+ if (x86_pmu.flags & (PMU_FL_EXCL_CNTRS | PMU_FL_TFA | PMU_FL_BR_CNTR)) {
size_t sz = X86_PMC_IDX_MAX * sizeof(struct event_constraint);
cpuc->constraint_list = kzalloc_node(sz, GFP_KERNEL, cpu_to_node(cpu));
@@ -5535,15 +5602,39 @@ static ssize_t branches_show(struct device *cdev,
static DEVICE_ATTR_RO(branches);
+static ssize_t branch_counter_nr_show(struct device *cdev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return snprintf(buf, PAGE_SIZE, "%d\n", fls(x86_pmu.lbr_counters));
+}
+
+static DEVICE_ATTR_RO(branch_counter_nr);
+
+static ssize_t branch_counter_width_show(struct device *cdev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return snprintf(buf, PAGE_SIZE, "2\n");
+}
+
+static DEVICE_ATTR_RO(branch_counter_width);
+
static struct attribute *lbr_attrs[] = {
&dev_attr_branches.attr,
+ &dev_attr_branch_counter_nr.attr,
+ &dev_attr_branch_counter_width.attr,
NULL
};
static umode_t
lbr_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
- return x86_pmu.lbr_nr ? attr->mode : 0;
+ /* branches */
+ if (i == 0)
+ return x86_pmu.lbr_nr ? attr->mode : 0;
+
+ return (x86_pmu.flags & PMU_FL_BR_CNTR) ? attr->mode : 0;
}
static char pmu_name_str[30];
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 1d309e97a629..98bda9796a28 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1912,7 +1912,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
if (has_branch_stack(event)) {
intel_pmu_store_pebs_lbrs(lbr);
- perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
+ intel_pmu_lbr_save_brstack(data, cpuc, event);
}
}
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index c3b0d15a9841..78cd5084104e 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -676,6 +676,25 @@ void intel_pmu_lbr_del(struct perf_event *event)
WARN_ON_ONCE(cpuc->lbr_users < 0);
WARN_ON_ONCE(cpuc->lbr_pebs_users < 0);
perf_sched_cb_dec(event->pmu);
+
+ /*
+ * The logged occurrences information is only valid for the
+ * current LBR group. If another LBR group is scheduled in
+ * later, the information from the stale LBRs will be wrongly
+ * interpreted. Reset the LBRs here.
+ *
+ * Only clear once for a branch counter group with the leader
+ * event. Because
+	 * - Cannot simply reset the LBRs with !cpuc->lbr_users,
+	 *   because it's possible that the last LBR user is not in a
+ * branch counter group, e.g., a branch_counters group +
+ * several normal LBR events.
+ * - The LBR reset can be done with any one of the events in a
+ * branch counter group, since they are always scheduled together.
+	 * It's easy to force the leader event to be an LBR event.
+ */
+ if (is_branch_counters_group(event) && event == event->group_leader)
+ intel_pmu_lbr_reset();
}
static inline bool vlbr_exclude_host(void)
@@ -866,6 +885,8 @@ static __always_inline u16 get_lbr_cycles(u64 info)
return cycles;
}
+static_assert((64 - PERF_BRANCH_ENTRY_INFO_BITS_MAX) > LBR_INFO_BR_CNTR_NUM * LBR_INFO_BR_CNTR_BITS);
+
static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,
struct lbr_entry *entries)
{
@@ -898,11 +919,67 @@ static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,
e->abort = !!(info & LBR_INFO_ABORT);
e->cycles = get_lbr_cycles(info);
e->type = get_lbr_br_type(info);
+
+ /*
+ * Leverage the reserved field of cpuc->lbr_entries[i] to
+ * temporarily store the branch counters information.
+ * The later code will decide what content can be disclosed
+	 * to the perf tool. Please see intel_pmu_lbr_counters_reorder().
+ */
+ e->reserved = (info >> LBR_INFO_BR_CNTR_OFFSET) & LBR_INFO_BR_CNTR_FULL_MASK;
}
cpuc->lbr_stack.nr = i;
}
+/*
+ * The enabled order may be different from the counter order.
+ * Update the lbr_counters with the enabled order.
+ */
+static void intel_pmu_lbr_counters_reorder(struct cpu_hw_events *cpuc,
+ struct perf_event *event)
+{
+ int i, j, pos = 0, order[X86_PMC_IDX_MAX];
+ struct perf_event *leader, *sibling;
+ u64 src, dst, cnt;
+
+ leader = event->group_leader;
+ if (branch_sample_counters(leader))
+ order[pos++] = leader->hw.idx;
+
+ for_each_sibling_event(sibling, leader) {
+ if (!branch_sample_counters(sibling))
+ continue;
+ order[pos++] = sibling->hw.idx;
+ }
+
+ WARN_ON_ONCE(!pos);
+
+ for (i = 0; i < cpuc->lbr_stack.nr; i++) {
+ src = cpuc->lbr_entries[i].reserved;
+ dst = 0;
+ for (j = 0; j < pos; j++) {
+ cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;
+ dst |= cnt << j * LBR_INFO_BR_CNTR_BITS;
+ }
+ cpuc->lbr_counters[i] = dst;
+ cpuc->lbr_entries[i].reserved = 0;
+ }
+}
+
+void intel_pmu_lbr_save_brstack(struct perf_sample_data *data,
+ struct cpu_hw_events *cpuc,
+ struct perf_event *event)
+{
+ if (is_branch_counters_group(event)) {
+ intel_pmu_lbr_counters_reorder(cpuc, event);
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack, cpuc->lbr_counters);
+ return;
+ }
+
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
+}
+
static void intel_pmu_arch_lbr_read(struct cpu_hw_events *cpuc)
{
intel_pmu_store_lbr(cpuc, NULL);
@@ -1173,8 +1250,10 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
for (i = 0; i < cpuc->lbr_stack.nr; ) {
if (!cpuc->lbr_entries[i].from) {
j = i;
- while (++j < cpuc->lbr_stack.nr)
+ while (++j < cpuc->lbr_stack.nr) {
cpuc->lbr_entries[j-1] = cpuc->lbr_entries[j];
+ cpuc->lbr_counters[j-1] = cpuc->lbr_counters[j];
+ }
cpuc->lbr_stack.nr--;
if (!cpuc->lbr_entries[i].from)
continue;
@@ -1525,8 +1604,12 @@ void __init intel_pmu_arch_lbr_init(void)
x86_pmu.lbr_mispred = ecx.split.lbr_mispred;
x86_pmu.lbr_timed_lbr = ecx.split.lbr_timed_lbr;
x86_pmu.lbr_br_type = ecx.split.lbr_br_type;
+ x86_pmu.lbr_counters = ecx.split.lbr_counters;
x86_pmu.lbr_nr = lbr_nr;
+ if (!!x86_pmu.lbr_counters)
+ x86_pmu.flags |= PMU_FL_BR_CNTR;
+
if (x86_pmu.lbr_mispred)
static_branch_enable(&x86_lbr_mispred);
if (x86_pmu.lbr_timed_lbr)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 53dd5d495ba6..fb56518356ec 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -110,6 +110,11 @@ static inline bool is_topdown_event(struct perf_event *event)
return is_metric_event(event) || is_slots_event(event);
}
+static inline bool is_branch_counters_group(struct perf_event *event)
+{
+ return event->group_leader->hw.flags & PERF_X86_EVENT_BRANCH_COUNTERS;
+}
+
struct amd_nb {
int nb_id; /* NorthBridge id */
int refcnt; /* reference count */
@@ -283,6 +288,7 @@ struct cpu_hw_events {
int lbr_pebs_users;
struct perf_branch_stack lbr_stack;
struct perf_branch_entry lbr_entries[MAX_LBR_ENTRIES];
+ u64 lbr_counters[MAX_LBR_ENTRIES]; /* branch stack extra */
union {
struct er_account *lbr_sel;
struct er_account *lbr_ctl;
@@ -888,6 +894,7 @@ struct x86_pmu {
unsigned int lbr_mispred:1;
unsigned int lbr_timed_lbr:1;
unsigned int lbr_br_type:1;
+ unsigned int lbr_counters:4;
void (*lbr_reset)(void);
void (*lbr_read)(struct cpu_hw_events *cpuc);
@@ -1012,6 +1019,7 @@ do { \
#define PMU_FL_INSTR_LATENCY 0x80 /* Support Instruction Latency in PEBS Memory Info Record */
#define PMU_FL_MEM_LOADS_AUX 0x100 /* Require an auxiliary event for the complete memory info */
#define PMU_FL_RETIRE_LATENCY 0x200 /* Support Retire Latency in PEBS */
+#define PMU_FL_BR_CNTR 0x400 /* Support branch counter logging */
#define EVENT_VAR(_id) event_attr_##_id
#define EVENT_PTR(_id) &event_attr_##_id.attr.attr
@@ -1552,6 +1560,10 @@ void intel_pmu_store_pebs_lbrs(struct lbr_entry *lbr);
void intel_ds_init(void);
+void intel_pmu_lbr_save_brstack(struct perf_sample_data *data,
+ struct cpu_hw_events *cpuc,
+ struct perf_event *event);
+
void intel_pmu_lbr_swap_task_ctx(struct perf_event_pmu_context *prev_epc,
struct perf_event_pmu_context *next_epc);
diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
index a1685981c520..6c977c19f2cd 100644
--- a/arch/x86/events/perf_event_flags.h
+++ b/arch/x86/events/perf_event_flags.h
@@ -21,3 +21,4 @@ PERF_ARCH(PEBS_STLAT, 0x08000) /* st+stlat data address sampling */
PERF_ARCH(AMD_BRS, 0x10000) /* AMD Branch Sampling */
PERF_ARCH(PEBS_LAT_HYBRID, 0x20000) /* ld and st lat for hybrid */
PERF_ARCH(NEEDS_BRANCH_STACK, 0x40000) /* require branch stack setup */
+PERF_ARCH(BRANCH_COUNTERS, 0x80000) /* logs the counters in the extra space of each branch */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index dc159acb350a..27e6f1c89fbf 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -236,6 +236,11 @@
#define LBR_INFO_CYCLES 0xffff
#define LBR_INFO_BR_TYPE_OFFSET 56
#define LBR_INFO_BR_TYPE (0xfull << LBR_INFO_BR_TYPE_OFFSET)
+#define LBR_INFO_BR_CNTR_OFFSET 32
+#define LBR_INFO_BR_CNTR_NUM 4
+#define LBR_INFO_BR_CNTR_BITS 2
+#define LBR_INFO_BR_CNTR_MASK GENMASK_ULL(LBR_INFO_BR_CNTR_BITS - 1, 0)
+#define LBR_INFO_BR_CNTR_FULL_MASK GENMASK_ULL(LBR_INFO_BR_CNTR_NUM * LBR_INFO_BR_CNTR_BITS - 1, 0)
#define MSR_ARCH_LBR_CTL 0x000014ce
#define ARCH_LBR_CTL_LBREN BIT(0)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 2618ec7c3d1d..3736b8a46c04 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -31,6 +31,7 @@
#define ARCH_PERFMON_EVENTSEL_ENABLE (1ULL << 22)
#define ARCH_PERFMON_EVENTSEL_INV (1ULL << 23)
#define ARCH_PERFMON_EVENTSEL_CMASK 0xFF000000ULL
+#define ARCH_PERFMON_EVENTSEL_BR_CNTR (1ULL << 35)
#define INTEL_FIXED_BITS_MASK 0xFULL
#define INTEL_FIXED_BITS_STRIDE 4
@@ -223,6 +224,9 @@ union cpuid28_ecx {
unsigned int lbr_timed_lbr:1;
/* Branch Type Field Supported */
unsigned int lbr_br_type:1;
+ unsigned int reserved:13;
+ /* Branch counters (Event Logging) Supported */
+ unsigned int lbr_counters:4;
} split;
unsigned int full;
};
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 4461f380425b..3a64499b0f5d 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1437,6 +1437,9 @@ struct perf_branch_entry {
reserved:31;
};
+/* Size of used info bits in struct perf_branch_entry */
+#define PERF_BRANCH_ENTRY_INFO_BITS_MAX 33
+
union perf_sample_weight {
__u64 full;
#if defined(__LITTLE_ENDIAN_BITFIELD)
--
2.35.1
On Wed, Oct 25, 2023 at 1:16 PM <[email protected]> wrote:
>
> From: Kan Liang <[email protected]>
>
> To support the branch counters feature, the information of the maximum
> number of supported counters and the width of the counters is exposed
> in the sysfs caps folder. The perf tool can use the information to parse
> the logged counters in each branch.
>
> Store the information in the perf_env for later usage.
>
> Signed-off-by: Kan Liang <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Thanks,
Ian
> ---
>
> No changes since V4
>
> tools/perf/util/env.h | 5 +++++
> tools/perf/util/header.c | 18 +++++++++++++++---
> 2 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index 4566c51f2fd9..48d7f8759a2a 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -46,6 +46,9 @@ struct hybrid_node {
> struct pmu_caps {
> int nr_caps;
> unsigned int max_branches;
> + unsigned int br_cntr_nr;
> + unsigned int br_cntr_width;
> +
> char **caps;
> char *pmu_name;
> };
> @@ -62,6 +65,8 @@ struct perf_env {
> unsigned long long total_mem;
> unsigned int msr_pmu_type;
> unsigned int max_branches;
> + unsigned int br_cntr_nr;
> + unsigned int br_cntr_width;
> int kernel_is_64_bit;
>
> int nr_cmdline;
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index d812e1e371a7..9664062ba835 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -3256,7 +3256,9 @@ static int process_compressed(struct feat_fd *ff,
> }
>
> static int __process_pmu_caps(struct feat_fd *ff, int *nr_caps,
> - char ***caps, unsigned int *max_branches)
> + char ***caps, unsigned int *max_branches,
> + unsigned int *br_cntr_nr,
> + unsigned int *br_cntr_width)
> {
> char *name, *value, *ptr;
> u32 nr_pmu_caps, i;
> @@ -3291,6 +3293,12 @@ static int __process_pmu_caps(struct feat_fd *ff, int *nr_caps,
> if (!strcmp(name, "branches"))
> *max_branches = atoi(value);
>
> + if (!strcmp(name, "branch_counter_nr"))
> + *br_cntr_nr = atoi(value);
> +
> + if (!strcmp(name, "branch_counter_width"))
> + *br_cntr_width = atoi(value);
> +
> free(value);
> free(name);
> }
> @@ -3315,7 +3323,9 @@ static int process_cpu_pmu_caps(struct feat_fd *ff,
> {
> int ret = __process_pmu_caps(ff, &ff->ph->env.nr_cpu_pmu_caps,
> &ff->ph->env.cpu_pmu_caps,
> - &ff->ph->env.max_branches);
> + &ff->ph->env.max_branches,
> + &ff->ph->env.br_cntr_nr,
> + &ff->ph->env.br_cntr_width);
>
> if (!ret && !ff->ph->env.cpu_pmu_caps)
> pr_debug("cpu pmu capabilities not available\n");
> @@ -3344,7 +3354,9 @@ static int process_pmu_caps(struct feat_fd *ff, void *data __maybe_unused)
> for (i = 0; i < nr_pmu; i++) {
> ret = __process_pmu_caps(ff, &pmu_caps[i].nr_caps,
> &pmu_caps[i].caps,
> - &pmu_caps[i].max_branches);
> + &pmu_caps[i].max_branches,
> + &pmu_caps[i].br_cntr_nr,
> + &pmu_caps[i].br_cntr_width);
> if (ret)
> goto err;
>
> --
> 2.35.1
>
On Wed, Oct 25, 2023 at 1:16 PM <[email protected]> wrote:
>
> From: Kan Liang <[email protected]>
>
> Add a new branch filter, "counter", for the branch counter option. It is
> used to mark the events which should be logged in the branch. If it is
> applied with the -j option, the counters of all the events should be
> logged in the branch. If the legacy kernel doesn't support the new
> branch sample type, switching off the branch counter filter.
>
> The stored counter values in each branch are displayed right after the
> regular branch stack information via perf report -D.
>
> Usage examples:
>
> perf record -e "{branch-instructions,branch-misses}:S" -j any,counter
>
> Only the first event, branch-instructions, collect the LBR. Both
> branch-instructions and branch-misses are marked as logged events.
> The occurrences information of them can be found in the branch stack
> extension space of each branch.
>
> perf record -e "{cpu/branch-instructions,branch_type=any/,
> cpu/branch-misses,branch_type=counter/}"
>
> Only the first event, branch-instructions, collect the LBR. Only the
> branch-misses event is marked as a logged event.
>
> Signed-off-by: Kan Liang <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Perhaps add a test somewhere like:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/shell/record.sh
for coverage when hardware supports the feature.
Thanks,
Ian
> ---
>
> No changes since V4
>
> tools/perf/Documentation/perf-record.txt | 4 +++
> tools/perf/util/evsel.c | 31 ++++++++++++++++++++++-
> tools/perf/util/evsel.h | 1 +
> tools/perf/util/parse-branch-options.c | 1 +
> tools/perf/util/perf_event_attr_fprintf.c | 1 +
> tools/perf/util/sample.h | 1 +
> tools/perf/util/session.c | 15 +++++++++--
> 7 files changed, 51 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index d5217be012d7..b6afe7cc948d 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -442,6 +442,10 @@ following filters are defined:
> 4th-Gen Xeon+ server), the save branch type is unconditionally enabled
> when the taken branch stack sampling is enabled.
> - priv: save privilege state during sampling in case binary is not available later
> + - counter: save occurrences of the event since the last branch entry. Currently, the
> + feature is only supported by a newer CPU, e.g., Intel Sierra Forest and
> + later platforms. An error out is expected if it's used on the unsupported
> + kernel or CPUs.
>
> +
> The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index a8a5ff87cc1f..58a9b8c82790 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1831,6 +1831,8 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
>
> static void evsel__disable_missing_features(struct evsel *evsel)
> {
> + if (perf_missing_features.branch_counters)
> + evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_COUNTERS;
> if (perf_missing_features.read_lost)
> evsel->core.attr.read_format &= ~PERF_FORMAT_LOST;
> if (perf_missing_features.weight_struct) {
> @@ -1884,7 +1886,12 @@ bool evsel__detect_missing_features(struct evsel *evsel)
> * Must probe features in the order they were added to the
> * perf_event_attr interface.
> */
> - if (!perf_missing_features.read_lost &&
> + if (!perf_missing_features.branch_counters &&
> + (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS)) {
> + perf_missing_features.branch_counters = true;
> + pr_debug2("switching off branch counters support\n");
> + return true;
> + } else if (!perf_missing_features.read_lost &&
> (evsel->core.attr.read_format & PERF_FORMAT_LOST)) {
> perf_missing_features.read_lost = true;
> pr_debug2("switching off PERF_FORMAT_LOST support\n");
> @@ -2344,6 +2351,18 @@ u64 evsel__bitfield_swap_branch_flags(u64 value)
> return new_val;
> }
>
> +static inline bool evsel__has_branch_counters(const struct evsel *evsel)
> +{
> + struct evsel *cur, *leader = evsel__leader(evsel);
> +
> + evlist__for_each_entry(evsel->evlist, cur) {
> + if ((leader == evsel__leader(cur)) &&
> + (cur->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS))
> + return true;
> + }
> + return false;
> +}
> +
> int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
> struct perf_sample *data)
> {
> @@ -2577,6 +2596,16 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
>
> OVERFLOW_CHECK(array, sz, max_size);
> array = (void *)array + sz;
> +
> + if (evsel__has_branch_counters(evsel)) {
> + OVERFLOW_CHECK_u64(array);
> +
> + data->branch_stack_cntr = (u64 *)array;
> + sz = data->branch_stack->nr * sizeof(u64);
> +
> + OVERFLOW_CHECK(array, sz, max_size);
> + array = (void *)array + sz;
> + }
> }
>
> if (type & PERF_SAMPLE_REGS_USER) {
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 848534ec74fa..85f24c986392 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -191,6 +191,7 @@ struct perf_missing_features {
> bool code_page_size;
> bool weight_struct;
> bool read_lost;
> + bool branch_counters;
> };
>
> extern struct perf_missing_features perf_missing_features;
> diff --git a/tools/perf/util/parse-branch-options.c b/tools/perf/util/parse-branch-options.c
> index fd67d204d720..f7f7aff3d85a 100644
> --- a/tools/perf/util/parse-branch-options.c
> +++ b/tools/perf/util/parse-branch-options.c
> @@ -36,6 +36,7 @@ static const struct branch_mode branch_modes[] = {
> BRANCH_OPT("stack", PERF_SAMPLE_BRANCH_CALL_STACK),
> BRANCH_OPT("hw_index", PERF_SAMPLE_BRANCH_HW_INDEX),
> BRANCH_OPT("priv", PERF_SAMPLE_BRANCH_PRIV_SAVE),
> + BRANCH_OPT("counter", PERF_SAMPLE_BRANCH_COUNTERS),
> BRANCH_END
> };
>
> diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
> index 2247991451f3..8f04d3b7f3ec 100644
> --- a/tools/perf/util/perf_event_attr_fprintf.c
> +++ b/tools/perf/util/perf_event_attr_fprintf.c
> @@ -55,6 +55,7 @@ static void __p_branch_sample_type(char *buf, size_t size, u64 value)
> bit_name(COND), bit_name(CALL_STACK), bit_name(IND_JUMP),
> bit_name(CALL), bit_name(NO_FLAGS), bit_name(NO_CYCLES),
> bit_name(TYPE_SAVE), bit_name(HW_INDEX), bit_name(PRIV_SAVE),
> + bit_name(COUNTERS),
> { .name = NULL, }
> };
> #undef bit_name
> diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
> index c92ad0f51ecd..70b2c3135555 100644
> --- a/tools/perf/util/sample.h
> +++ b/tools/perf/util/sample.h
> @@ -113,6 +113,7 @@ struct perf_sample {
> void *raw_data;
> struct ip_callchain *callchain;
> struct branch_stack *branch_stack;
> + u64 *branch_stack_cntr;
> struct regs_dump user_regs;
> struct regs_dump intr_regs;
> struct stack_dump user_stack;
> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
> index 1e9aa8ed15b6..4a094ab0362b 100644
> --- a/tools/perf/util/session.c
> +++ b/tools/perf/util/session.c
> @@ -1150,9 +1150,13 @@ static void callchain__printf(struct evsel *evsel,
> i, callchain->ips[i]);
> }
>
> -static void branch_stack__printf(struct perf_sample *sample, bool callstack)
> +static void branch_stack__printf(struct perf_sample *sample,
> + struct evsel *evsel)
> {
> struct branch_entry *entries = perf_sample__branch_entries(sample);
> + bool callstack = evsel__has_branch_callstack(evsel);
> + u64 *branch_stack_cntr = sample->branch_stack_cntr;
> + struct perf_env *env = evsel__env(evsel);
> uint64_t i;
>
> if (!callstack) {
> @@ -1194,6 +1198,13 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
> }
> }
> }
> +
> + if (branch_stack_cntr) {
> + printf("... branch stack counters: nr:%" PRIu64 " (counter width: %u max counter nr:%u)\n",
> + sample->branch_stack->nr, env->br_cntr_width, env->br_cntr_nr);
> + for (i = 0; i < sample->branch_stack->nr; i++)
> + printf("..... %2"PRIu64": %016" PRIx64 "\n", i, branch_stack_cntr[i]);
> + }
> }
>
> static void regs_dump__printf(u64 mask, u64 *regs, const char *arch)
> @@ -1355,7 +1366,7 @@ static void dump_sample(struct evsel *evsel, union perf_event *event,
> callchain__printf(evsel, sample);
>
> if (evsel__has_br_stack(evsel))
> - branch_stack__printf(sample, evsel__has_branch_callstack(evsel));
> + branch_stack__printf(sample, evsel);
>
> if (sample_type & PERF_SAMPLE_REGS_USER)
> regs_user__printf(sample, arch);
> --
> 2.35.1
>
On 2023-10-25 10:12 p.m., Ian Rogers wrote:
> On Wed, Oct 25, 2023 at 1:16 PM <[email protected]> wrote:
>>
>> From: Kan Liang <[email protected]>
>>
>> Add a new branch filter, "counter", for the branch counter option. It is
>> used to mark the events which should be logged in the branch. If it is
>> applied with the -j option, the counters of all the events should be
>> logged in the branch. If the legacy kernel doesn't support the new
>> branch sample type, the branch counter filter is switched off.
>>
>> The stored counter values in each branch are displayed right after the
>> regular branch stack information via perf report -D.
>>
>> Usage examples:
>>
>> perf record -e "{branch-instructions,branch-misses}:S" -j any,counter
>>
>> Only the first event, branch-instructions, collects the LBR. Both
>> branch-instructions and branch-misses are marked as logged events.
>> Their occurrence information can be found in the branch stack
>> extension space of each branch.
>>
>> perf record -e "{cpu/branch-instructions,branch_type=any/,
>> cpu/branch-misses,branch_type=counter/}"
>>
>> Only the first event, branch-instructions, collects the LBR. Only the
>> branch-misses event is marked as a logged event.
>>
>> Signed-off-by: Kan Liang <[email protected]>
>
> Reviewed-by: Ian Rogers <[email protected]>
Thanks Ian for the review.
>
> Perhaps add a test somewhere like:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/tests/shell/record.sh
> for coverage when hardware supports the feature.
Sure, I will add a test when posting the perf tool patches.
Thanks,
Kan
>
> Thanks,
> Ian
>
>> ---
>>
>> No changes since V4
>>
>> tools/perf/Documentation/perf-record.txt | 4 +++
>> tools/perf/util/evsel.c | 31 ++++++++++++++++++++++-
>> tools/perf/util/evsel.h | 1 +
>> tools/perf/util/parse-branch-options.c | 1 +
>> tools/perf/util/perf_event_attr_fprintf.c | 1 +
>> tools/perf/util/sample.h | 1 +
>> tools/perf/util/session.c | 15 +++++++++--
>> 7 files changed, 51 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
>> index d5217be012d7..b6afe7cc948d 100644
>> --- a/tools/perf/Documentation/perf-record.txt
>> +++ b/tools/perf/Documentation/perf-record.txt
>> @@ -442,6 +442,10 @@ following filters are defined:
>> 4th-Gen Xeon+ server), the save branch type is unconditionally enabled
>> when the taken branch stack sampling is enabled.
>> - priv: save privilege state during sampling in case binary is not available later
>> + - counter: save occurrences of the event since the last branch entry. Currently, the
>> + feature is only supported on newer CPUs, e.g., Intel Sierra Forest and
>> + later platforms. An error is expected if it's used on an unsupported
>> + kernel or CPU.
>>
>> +
>> The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index a8a5ff87cc1f..58a9b8c82790 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -1831,6 +1831,8 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
>>
>> static void evsel__disable_missing_features(struct evsel *evsel)
>> {
>> + if (perf_missing_features.branch_counters)
>> + evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_COUNTERS;
>> if (perf_missing_features.read_lost)
>> evsel->core.attr.read_format &= ~PERF_FORMAT_LOST;
>> if (perf_missing_features.weight_struct) {
>> @@ -1884,7 +1886,12 @@ bool evsel__detect_missing_features(struct evsel *evsel)
>> * Must probe features in the order they were added to the
>> * perf_event_attr interface.
>> */
>> - if (!perf_missing_features.read_lost &&
>> + if (!perf_missing_features.branch_counters &&
>> + (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS)) {
>> + perf_missing_features.branch_counters = true;
>> + pr_debug2("switching off branch counters support\n");
>> + return true;
>> + } else if (!perf_missing_features.read_lost &&
>> (evsel->core.attr.read_format & PERF_FORMAT_LOST)) {
>> perf_missing_features.read_lost = true;
>> pr_debug2("switching off PERF_FORMAT_LOST support\n");
>> @@ -2344,6 +2351,18 @@ u64 evsel__bitfield_swap_branch_flags(u64 value)
>> return new_val;
>> }
>>
>> +static inline bool evsel__has_branch_counters(const struct evsel *evsel)
>> +{
>> + struct evsel *cur, *leader = evsel__leader(evsel);
>> +
>> + evlist__for_each_entry(evsel->evlist, cur) {
>> + if ((leader == evsel__leader(cur)) &&
>> + (cur->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS))
>> + return true;
>> + }
>> + return false;
>> +}
>> +
>> int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
>> struct perf_sample *data)
>> {
>> @@ -2577,6 +2596,16 @@ int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
>>
>> OVERFLOW_CHECK(array, sz, max_size);
>> array = (void *)array + sz;
>> +
>> + if (evsel__has_branch_counters(evsel)) {
>> + OVERFLOW_CHECK_u64(array);
>> +
>> + data->branch_stack_cntr = (u64 *)array;
>> + sz = data->branch_stack->nr * sizeof(u64);
>> +
>> + OVERFLOW_CHECK(array, sz, max_size);
>> + array = (void *)array + sz;
>> + }
>> }
>>
>> if (type & PERF_SAMPLE_REGS_USER) {
>> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
>> index 848534ec74fa..85f24c986392 100644
>> --- a/tools/perf/util/evsel.h
>> +++ b/tools/perf/util/evsel.h
>> @@ -191,6 +191,7 @@ struct perf_missing_features {
>> bool code_page_size;
>> bool weight_struct;
>> bool read_lost;
>> + bool branch_counters;
>> };
>>
>> extern struct perf_missing_features perf_missing_features;
>> diff --git a/tools/perf/util/parse-branch-options.c b/tools/perf/util/parse-branch-options.c
>> index fd67d204d720..f7f7aff3d85a 100644
>> --- a/tools/perf/util/parse-branch-options.c
>> +++ b/tools/perf/util/parse-branch-options.c
>> @@ -36,6 +36,7 @@ static const struct branch_mode branch_modes[] = {
>> BRANCH_OPT("stack", PERF_SAMPLE_BRANCH_CALL_STACK),
>> BRANCH_OPT("hw_index", PERF_SAMPLE_BRANCH_HW_INDEX),
>> BRANCH_OPT("priv", PERF_SAMPLE_BRANCH_PRIV_SAVE),
>> + BRANCH_OPT("counter", PERF_SAMPLE_BRANCH_COUNTERS),
>> BRANCH_END
>> };
>>
>> diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
>> index 2247991451f3..8f04d3b7f3ec 100644
>> --- a/tools/perf/util/perf_event_attr_fprintf.c
>> +++ b/tools/perf/util/perf_event_attr_fprintf.c
>> @@ -55,6 +55,7 @@ static void __p_branch_sample_type(char *buf, size_t size, u64 value)
>> bit_name(COND), bit_name(CALL_STACK), bit_name(IND_JUMP),
>> bit_name(CALL), bit_name(NO_FLAGS), bit_name(NO_CYCLES),
>> bit_name(TYPE_SAVE), bit_name(HW_INDEX), bit_name(PRIV_SAVE),
>> + bit_name(COUNTERS),
>> { .name = NULL, }
>> };
>> #undef bit_name
>> diff --git a/tools/perf/util/sample.h b/tools/perf/util/sample.h
>> index c92ad0f51ecd..70b2c3135555 100644
>> --- a/tools/perf/util/sample.h
>> +++ b/tools/perf/util/sample.h
>> @@ -113,6 +113,7 @@ struct perf_sample {
>> void *raw_data;
>> struct ip_callchain *callchain;
>> struct branch_stack *branch_stack;
>> + u64 *branch_stack_cntr;
>> struct regs_dump user_regs;
>> struct regs_dump intr_regs;
>> struct stack_dump user_stack;
>> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
>> index 1e9aa8ed15b6..4a094ab0362b 100644
>> --- a/tools/perf/util/session.c
>> +++ b/tools/perf/util/session.c
>> @@ -1150,9 +1150,13 @@ static void callchain__printf(struct evsel *evsel,
>> i, callchain->ips[i]);
>> }
>>
>> -static void branch_stack__printf(struct perf_sample *sample, bool callstack)
>> +static void branch_stack__printf(struct perf_sample *sample,
>> + struct evsel *evsel)
>> {
>> struct branch_entry *entries = perf_sample__branch_entries(sample);
>> + bool callstack = evsel__has_branch_callstack(evsel);
>> + u64 *branch_stack_cntr = sample->branch_stack_cntr;
>> + struct perf_env *env = evsel__env(evsel);
>> uint64_t i;
>>
>> if (!callstack) {
>> @@ -1194,6 +1198,13 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
>> }
>> }
>> }
>> +
>> + if (branch_stack_cntr) {
>> + printf("... branch stack counters: nr:%" PRIu64 " (counter width: %u max counter nr:%u)\n",
>> + sample->branch_stack->nr, env->br_cntr_width, env->br_cntr_nr);
>> + for (i = 0; i < sample->branch_stack->nr; i++)
>> + printf("..... %2"PRIu64": %016" PRIx64 "\n", i, branch_stack_cntr[i]);
>> + }
>> }
>>
>> static void regs_dump__printf(u64 mask, u64 *regs, const char *arch)
>> @@ -1355,7 +1366,7 @@ static void dump_sample(struct evsel *evsel, union perf_event *event,
>> callchain__printf(evsel, sample);
>>
>> if (evsel__has_br_stack(evsel))
>> - branch_stack__printf(sample, evsel__has_branch_callstack(evsel));
>> + branch_stack__printf(sample, evsel);
>>
>> if (sample_type & PERF_SAMPLE_REGS_USER)
>> regs_user__printf(sample, arch);
>> --
>> 2.35.1
>>
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 1f2376cd03dd3b965d130ed46a7c92769d614ba1
Gitweb: https://git.kernel.org/tip/1f2376cd03dd3b965d130ed46a7c92769d614ba1
Author: Kan Liang <[email protected]>
AuthorDate: Wed, 25 Oct 2023 13:16:21 -07:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Fri, 27 Oct 2023 15:05:09 +02:00
perf: Add branch_sample_call_stack
Add a helper function to check the call stack sample type.
A later patch will invoke the function in several places.
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/events/core.c | 2 +-
include/linux/perf_event.h | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 40c9af1..0905064 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -601,7 +601,7 @@ int x86_pmu_hw_config(struct perf_event *event)
}
}
- if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_CALL_STACK)
+ if (branch_sample_call_stack(event))
event->attach_state |= PERF_ATTACH_TASK_DATA;
/*
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 7897ef0..ac1a59c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1144,6 +1144,11 @@ static inline bool branch_sample_counters(const struct perf_event *event)
return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS;
}
+static inline bool branch_sample_call_stack(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_CALL_STACK;
+}
+
struct perf_sample_data {
/*
* Fields set by perf_sample_data_init() unconditionally,
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 85846b27072defc7ab3dcee7ff36563a040079dc
Gitweb: https://git.kernel.org/tip/85846b27072defc7ab3dcee7ff36563a040079dc
Author: Kan Liang <[email protected]>
AuthorDate: Wed, 25 Oct 2023 13:16:20 -07:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Fri, 27 Oct 2023 15:05:09 +02:00
perf/x86: Add PERF_X86_EVENT_NEEDS_BRANCH_STACK flag
Currently, branch_sample_type != 0 is used to check whether a branch
stack setup is required. But it doesn't check the sample type, so an
unnecessary branch stack setup may be done for a counting event. E.g.,
perf record -e "{branch-instructions,branch-misses}:S" -j any
Also, the event only with the new PERF_SAMPLE_BRANCH_COUNTERS branch
sample type may not require a branch stack setup either.
Add a new flag NEEDS_BRANCH_STACK to indicate whether the event requires
a branch stack setup. Replace the needs_branch_stack() by checking the
new flag.
The counting event check is implemented here. A later patch will take
the new PERF_SAMPLE_BRANCH_COUNTERS into account.
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/events/intel/core.c | 14 +++++++++++---
arch/x86/events/perf_event_flags.h | 1 +
2 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 41a1647..a99449c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2527,9 +2527,14 @@ static void intel_pmu_assign_event(struct perf_event *event, int idx)
perf_report_aux_output_id(event, idx);
}
+static __always_inline bool intel_pmu_needs_branch_stack(struct perf_event *event)
+{
+ return event->hw.flags & PERF_X86_EVENT_NEEDS_BRANCH_STACK;
+}
+
static void intel_pmu_del_event(struct perf_event *event)
{
- if (needs_branch_stack(event))
+ if (intel_pmu_needs_branch_stack(event))
intel_pmu_lbr_del(event);
if (event->attr.precise_ip)
intel_pmu_pebs_del(event);
@@ -2820,7 +2825,7 @@ static void intel_pmu_add_event(struct perf_event *event)
{
if (event->attr.precise_ip)
intel_pmu_pebs_add(event);
- if (needs_branch_stack(event))
+ if (intel_pmu_needs_branch_stack(event))
intel_pmu_lbr_add(event);
}
@@ -3897,7 +3902,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
x86_pmu.pebs_aliases(event);
}
- if (needs_branch_stack(event)) {
+ if (needs_branch_stack(event) && is_sampling_event(event))
+ event->hw.flags |= PERF_X86_EVENT_NEEDS_BRANCH_STACK;
+
+ if (intel_pmu_needs_branch_stack(event)) {
ret = intel_pmu_setup_lbr_filter(event);
if (ret)
return ret;
diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
index 1dc19b9..a168598 100644
--- a/arch/x86/events/perf_event_flags.h
+++ b/arch/x86/events/perf_event_flags.h
@@ -20,3 +20,4 @@ PERF_ARCH(TOPDOWN, 0x04000) /* Count Topdown slots/metrics events */
PERF_ARCH(PEBS_STLAT, 0x08000) /* st+stlat data address sampling */
PERF_ARCH(AMD_BRS, 0x10000) /* AMD Branch Sampling */
PERF_ARCH(PEBS_LAT_HYBRID, 0x20000) /* ld and st lat for hybrid */
+PERF_ARCH(NEEDS_BRANCH_STACK, 0x40000) /* require branch stack setup */
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 33744916196b4ed7a50f6f47af7c3ad46b730ce6
Gitweb: https://git.kernel.org/tip/33744916196b4ed7a50f6f47af7c3ad46b730ce6
Author: Kan Liang <[email protected]>
AuthorDate: Wed, 25 Oct 2023 13:16:23 -07:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Fri, 27 Oct 2023 15:05:11 +02:00
perf/x86/intel: Support branch counters logging
The branch counters logging (a.k.a. LBR event logging) introduces a
per-counter indication of precise event occurrences in LBRs. It can
provide a means to attribute exposed retirement latency to combinations
of events across a block of instructions. It also provides a means of
attributing Timed LBR latencies to events.
The feature is first introduced on SRF/GRR. It is an enhancement of the
ARCH LBR. It adds new fields in the LBR_INFO MSRs to log the occurrences
of events on the GP counters. The information is laid out in counter
order.
The design proposed in this patch requires that the logged events be in
the same group as the event that has the LBR. If there is more than one
LBR group, only the counter information from the current (overflowed)
group is stored for the perf tool; otherwise, the perf tool cannot know
which other groups are scheduled and when, especially when multiplexing
is triggered. The user can make use of the maximum number of counters
that support LBR info (currently 4) by making the group large enough.
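E.g., a group like the following (an illustrative command line; the
choice of sibling events is an assumption, not taken from the patch
set) marks four events as logged and can use all four supported
counters:

  perf record -e "{branch-instructions,branch-misses,instructions,cache-misses}:S" -j any,counter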
The HW only logs events in counter order, which may differ from the
order in which the events were enabled, the order the perf tool
understands. When parsing the information of each branch entry, convert
the counter order to the enabled order, and store the enabled order in
the extension space.
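For instance (an illustrative case, not from the patch set): if two
logged events were enabled in the order e0, e1 but scheduled on
counters 3 and 1, the reorder step copies counter 3's 2-bit field into
bits [1:0] and counter 1's field into bits [3:2] of the stored u64, so
the perf tool can index the counts by enabled order.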
Unconditionally reset LBRs for an LBR event group when it's deleted. The
logged counter information is only valid for the current LBR group. If
another LBR group is scheduled later, the information from the stale
LBRs would otherwise be wrongly interpreted.
Add a sanity check in intel_pmu_hw_config(). Disable the feature if other
counter filters (inv, cmask, edge, in_tx) are set or LBR call stack mode
is enabled. (For the LBR call stack mode, we cannot simply flush the
LBR, since it will break the call stack. Also, there is no obvious usage
with the call stack mode for now.)
Applying only PERF_SAMPLE_BRANCH_COUNTERS doesn't require any branch
stack setup.
Expose the maximum number of supported counters and the width of the
counters via sysfs. The perf tool can use the information to parse
the logged counters in each branch.
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/events/intel/core.c | 103 ++++++++++++++++++++++++++--
arch/x86/events/intel/ds.c | 2 +-
arch/x86/events/intel/lbr.c | 85 ++++++++++++++++++++++-
arch/x86/events/perf_event.h | 12 +++-
arch/x86/events/perf_event_flags.h | 1 +-
arch/x86/include/asm/msr-index.h | 5 +-
arch/x86/include/asm/perf_event.h | 4 +-
include/uapi/linux/perf_event.h | 3 +-
8 files changed, 207 insertions(+), 8 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 584b58d..e068a96 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2792,6 +2792,7 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
static void intel_pmu_enable_event(struct perf_event *event)
{
+ u64 enable_mask = ARCH_PERFMON_EVENTSEL_ENABLE;
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
@@ -2800,8 +2801,10 @@ static void intel_pmu_enable_event(struct perf_event *event)
switch (idx) {
case 0 ... INTEL_PMC_IDX_FIXED - 1:
+ if (branch_sample_counters(event))
+ enable_mask |= ARCH_PERFMON_EVENTSEL_BR_CNTR;
intel_set_masks(event, idx);
- __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
+ __x86_pmu_enable_event(hwc, enable_mask);
break;
case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END:
@@ -3052,7 +3055,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
perf_sample_data_init(&data, 0, event->hw.last_period);
if (has_branch_stack(event))
- perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
+ intel_pmu_lbr_save_brstack(&data, cpuc, event);
if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
@@ -3617,6 +3620,13 @@ intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
if (cpuc->excl_cntrs)
return intel_get_excl_constraints(cpuc, event, idx, c2);
+ /* Not all counters support the branch counter feature. */
+ if (branch_sample_counters(event)) {
+ c2 = dyn_constraint(cpuc, c2, idx);
+ c2->idxmsk64 &= x86_pmu.lbr_counters;
+ c2->weight = hweight64(c2->idxmsk64);
+ }
+
return c2;
}
@@ -3905,6 +3915,58 @@ static int intel_pmu_hw_config(struct perf_event *event)
if (needs_branch_stack(event) && is_sampling_event(event))
event->hw.flags |= PERF_X86_EVENT_NEEDS_BRANCH_STACK;
+ if (branch_sample_counters(event)) {
+ struct perf_event *leader, *sibling;
+ int num = 0;
+
+ if (!(x86_pmu.flags & PMU_FL_BR_CNTR) ||
+ (event->attr.config & ~INTEL_ARCH_EVENT_MASK))
+ return -EINVAL;
+
+ /*
+ * The branch counter logging is not supported in the call stack
+ * mode yet, since we cannot simply flush the LBR during e.g.,
+ * multiplexing. Also, there is no obvious usage with the call
+ * stack mode. Simply forbid it for now.
+ *
+ * If any events in the group enable the branch counter logging
+ * feature, the group is treated as a branch counter logging
+ * group, which requires the extra space to store the counters.
+ */
+ leader = event->group_leader;
+ if (branch_sample_call_stack(leader))
+ return -EINVAL;
+ if (branch_sample_counters(leader))
+ num++;
+ leader->hw.flags |= PERF_X86_EVENT_BRANCH_COUNTERS;
+
+ for_each_sibling_event(sibling, leader) {
+ if (branch_sample_call_stack(sibling))
+ return -EINVAL;
+ if (branch_sample_counters(sibling))
+ num++;
+ }
+
+ if (num > fls(x86_pmu.lbr_counters))
+ return -EINVAL;
+ /*
+ * Only applying the PERF_SAMPLE_BRANCH_COUNTERS doesn't
+ * require any branch stack setup.
+ * Clear the bit to avoid unnecessary branch stack setup.
+ */
+ if (0 == (event->attr.branch_sample_type &
+ ~(PERF_SAMPLE_BRANCH_PLM_ALL |
+ PERF_SAMPLE_BRANCH_COUNTERS)))
+ event->hw.flags &= ~PERF_X86_EVENT_NEEDS_BRANCH_STACK;
+
+ /*
+ * Force the leader to be a LBR event. So LBRs can be reset
+ * with the leader event. See intel_pmu_lbr_del() for details.
+ */
+ if (!intel_pmu_needs_branch_stack(leader))
+ return -EINVAL;
+ }
+
if (intel_pmu_needs_branch_stack(event)) {
ret = intel_pmu_setup_lbr_filter(event);
if (ret)
@@ -4383,8 +4445,13 @@ cmt_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
*/
if (event->attr.precise_ip == 3) {
/* Force instruction:ppp on PMC0, 1 and Fixed counter 0 */
- if (constraint_match(&fixed0_constraint, event->hw.config))
- return &fixed0_counter0_1_constraint;
+ if (constraint_match(&fixed0_constraint, event->hw.config)) {
+ /* The fixed counter 0 doesn't support LBR event logging. */
+ if (branch_sample_counters(event))
+ return &counter0_1_constraint;
+ else
+ return &fixed0_counter0_1_constraint;
+ }
switch (c->idxmsk64 & 0x3ull) {
case 0x1:
@@ -4563,7 +4630,7 @@ int intel_cpuc_prepare(struct cpu_hw_events *cpuc, int cpu)
goto err;
}
- if (x86_pmu.flags & (PMU_FL_EXCL_CNTRS | PMU_FL_TFA)) {
+ if (x86_pmu.flags & (PMU_FL_EXCL_CNTRS | PMU_FL_TFA | PMU_FL_BR_CNTR)) {
size_t sz = X86_PMC_IDX_MAX * sizeof(struct event_constraint);
cpuc->constraint_list = kzalloc_node(sz, GFP_KERNEL, cpu_to_node(cpu));
@@ -5535,15 +5602,39 @@ static ssize_t branches_show(struct device *cdev,
static DEVICE_ATTR_RO(branches);
+static ssize_t branch_counter_nr_show(struct device *cdev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return snprintf(buf, PAGE_SIZE, "%d\n", fls(x86_pmu.lbr_counters));
+}
+
+static DEVICE_ATTR_RO(branch_counter_nr);
+
+static ssize_t branch_counter_width_show(struct device *cdev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return snprintf(buf, PAGE_SIZE, "%d\n", LBR_INFO_BR_CNTR_BITS);
+}
+
+static DEVICE_ATTR_RO(branch_counter_width);
+
static struct attribute *lbr_attrs[] = {
&dev_attr_branches.attr,
+ &dev_attr_branch_counter_nr.attr,
+ &dev_attr_branch_counter_width.attr,
NULL
};
static umode_t
lbr_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
- return x86_pmu.lbr_nr ? attr->mode : 0;
+ /* branches */
+ if (i == 0)
+ return x86_pmu.lbr_nr ? attr->mode : 0;
+
+ return (x86_pmu.flags & PMU_FL_BR_CNTR) ? attr->mode : 0;
}
static char pmu_name_str[30];
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index cb3f329..d49d661 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1912,7 +1912,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
if (has_branch_stack(event)) {
intel_pmu_store_pebs_lbrs(lbr);
- perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
+ intel_pmu_lbr_save_brstack(data, cpuc, event);
}
}
diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index c3b0d15..78cd508 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -676,6 +676,25 @@ void intel_pmu_lbr_del(struct perf_event *event)
WARN_ON_ONCE(cpuc->lbr_users < 0);
WARN_ON_ONCE(cpuc->lbr_pebs_users < 0);
perf_sched_cb_dec(event->pmu);
+
+ /*
+ * The logged occurrences information is only valid for the
+ * current LBR group. If another LBR group is scheduled in
+ * later, the information from the stale LBRs will be wrongly
+ * interpreted. Reset the LBRs here.
+ *
+ * Only clear once for a branch counter group with the leader
+ * event. Because
+ * - Cannot simply reset the LBRs on !cpuc->lbr_users, because
+ * it's possible that the last LBR user is not in a
+ * branch counter group, e.g., a branch_counters group +
+ * several normal LBR events.
+ * - The LBR reset can be done with any one of the events in a
+ * branch counter group, since they are always scheduled together.
+ * It's easy to force the leader event to be an LBR event.
+ */
+ if (is_branch_counters_group(event) && event == event->group_leader)
+ intel_pmu_lbr_reset();
}
static inline bool vlbr_exclude_host(void)
@@ -866,6 +885,8 @@ static __always_inline u16 get_lbr_cycles(u64 info)
return cycles;
}
+static_assert((64 - PERF_BRANCH_ENTRY_INFO_BITS_MAX) > LBR_INFO_BR_CNTR_NUM * LBR_INFO_BR_CNTR_BITS);
+
static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,
struct lbr_entry *entries)
{
@@ -898,11 +919,67 @@ static void intel_pmu_store_lbr(struct cpu_hw_events *cpuc,
e->abort = !!(info & LBR_INFO_ABORT);
e->cycles = get_lbr_cycles(info);
e->type = get_lbr_br_type(info);
+
+ /*
+ * Leverage the reserved field of cpuc->lbr_entries[i] to
+ * temporarily store the branch counters information.
+ * The later code will decide what content can be disclosed
+ * to the perf tool. Please see intel_pmu_lbr_counters_reorder().
+ */
+ e->reserved = (info >> LBR_INFO_BR_CNTR_OFFSET) & LBR_INFO_BR_CNTR_FULL_MASK;
}
cpuc->lbr_stack.nr = i;
}
+/*
+ * The enabled order may be different from the counter order.
+ * Update the lbr_counters with the enabled order.
+ */
+static void intel_pmu_lbr_counters_reorder(struct cpu_hw_events *cpuc,
+ struct perf_event *event)
+{
+ int i, j, pos = 0, order[X86_PMC_IDX_MAX];
+ struct perf_event *leader, *sibling;
+ u64 src, dst, cnt;
+
+ leader = event->group_leader;
+ if (branch_sample_counters(leader))
+ order[pos++] = leader->hw.idx;
+
+ for_each_sibling_event(sibling, leader) {
+ if (!branch_sample_counters(sibling))
+ continue;
+ order[pos++] = sibling->hw.idx;
+ }
+
+ WARN_ON_ONCE(!pos);
+
+ for (i = 0; i < cpuc->lbr_stack.nr; i++) {
+ src = cpuc->lbr_entries[i].reserved;
+ dst = 0;
+ for (j = 0; j < pos; j++) {
+ cnt = (src >> (order[j] * LBR_INFO_BR_CNTR_BITS)) & LBR_INFO_BR_CNTR_MASK;
+ dst |= cnt << j * LBR_INFO_BR_CNTR_BITS;
+ }
+ cpuc->lbr_counters[i] = dst;
+ cpuc->lbr_entries[i].reserved = 0;
+ }
+}
+
+void intel_pmu_lbr_save_brstack(struct perf_sample_data *data,
+ struct cpu_hw_events *cpuc,
+ struct perf_event *event)
+{
+ if (is_branch_counters_group(event)) {
+ intel_pmu_lbr_counters_reorder(cpuc, event);
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack, cpuc->lbr_counters);
+ return;
+ }
+
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
+}
+
static void intel_pmu_arch_lbr_read(struct cpu_hw_events *cpuc)
{
intel_pmu_store_lbr(cpuc, NULL);
@@ -1173,8 +1250,10 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
for (i = 0; i < cpuc->lbr_stack.nr; ) {
if (!cpuc->lbr_entries[i].from) {
j = i;
- while (++j < cpuc->lbr_stack.nr)
+ while (++j < cpuc->lbr_stack.nr) {
cpuc->lbr_entries[j-1] = cpuc->lbr_entries[j];
+ cpuc->lbr_counters[j-1] = cpuc->lbr_counters[j];
+ }
cpuc->lbr_stack.nr--;
if (!cpuc->lbr_entries[i].from)
continue;
@@ -1525,8 +1604,12 @@ void __init intel_pmu_arch_lbr_init(void)
x86_pmu.lbr_mispred = ecx.split.lbr_mispred;
x86_pmu.lbr_timed_lbr = ecx.split.lbr_timed_lbr;
x86_pmu.lbr_br_type = ecx.split.lbr_br_type;
+ x86_pmu.lbr_counters = ecx.split.lbr_counters;
x86_pmu.lbr_nr = lbr_nr;
+ if (!!x86_pmu.lbr_counters)
+ x86_pmu.flags |= PMU_FL_BR_CNTR;
+
if (x86_pmu.lbr_mispred)
static_branch_enable(&x86_lbr_mispred);
if (x86_pmu.lbr_timed_lbr)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 53dd5d4..fb56518 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -110,6 +110,11 @@ static inline bool is_topdown_event(struct perf_event *event)
return is_metric_event(event) || is_slots_event(event);
}
+static inline bool is_branch_counters_group(struct perf_event *event)
+{
+ return event->group_leader->hw.flags & PERF_X86_EVENT_BRANCH_COUNTERS;
+}
+
struct amd_nb {
int nb_id; /* NorthBridge id */
int refcnt; /* reference count */
@@ -283,6 +288,7 @@ struct cpu_hw_events {
int lbr_pebs_users;
struct perf_branch_stack lbr_stack;
struct perf_branch_entry lbr_entries[MAX_LBR_ENTRIES];
+ u64 lbr_counters[MAX_LBR_ENTRIES]; /* branch stack extra */
union {
struct er_account *lbr_sel;
struct er_account *lbr_ctl;
@@ -888,6 +894,7 @@ struct x86_pmu {
unsigned int lbr_mispred:1;
unsigned int lbr_timed_lbr:1;
unsigned int lbr_br_type:1;
+ unsigned int lbr_counters:4;
void (*lbr_reset)(void);
void (*lbr_read)(struct cpu_hw_events *cpuc);
@@ -1012,6 +1019,7 @@ do { \
#define PMU_FL_INSTR_LATENCY 0x80 /* Support Instruction Latency in PEBS Memory Info Record */
#define PMU_FL_MEM_LOADS_AUX 0x100 /* Require an auxiliary event for the complete memory info */
#define PMU_FL_RETIRE_LATENCY 0x200 /* Support Retire Latency in PEBS */
+#define PMU_FL_BR_CNTR 0x400 /* Support branch counter logging */
#define EVENT_VAR(_id) event_attr_##_id
#define EVENT_PTR(_id) &event_attr_##_id.attr.attr
@@ -1552,6 +1560,10 @@ void intel_pmu_store_pebs_lbrs(struct lbr_entry *lbr);
void intel_ds_init(void);
+void intel_pmu_lbr_save_brstack(struct perf_sample_data *data,
+ struct cpu_hw_events *cpuc,
+ struct perf_event *event);
+
void intel_pmu_lbr_swap_task_ctx(struct perf_event_pmu_context *prev_epc,
struct perf_event_pmu_context *next_epc);
diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
index a168598..6c977c1 100644
--- a/arch/x86/events/perf_event_flags.h
+++ b/arch/x86/events/perf_event_flags.h
@@ -21,3 +21,4 @@ PERF_ARCH(PEBS_STLAT, 0x08000) /* st+stlat data address sampling */
PERF_ARCH(AMD_BRS, 0x10000) /* AMD Branch Sampling */
PERF_ARCH(PEBS_LAT_HYBRID, 0x20000) /* ld and st lat for hybrid */
PERF_ARCH(NEEDS_BRANCH_STACK, 0x40000) /* require branch stack setup */
+PERF_ARCH(BRANCH_COUNTERS, 0x80000) /* logs the counters in the extra space of each branch */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index f8b5028..a5b0a19 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -236,6 +236,11 @@
#define LBR_INFO_CYCLES 0xffff
#define LBR_INFO_BR_TYPE_OFFSET 56
#define LBR_INFO_BR_TYPE (0xfull << LBR_INFO_BR_TYPE_OFFSET)
+#define LBR_INFO_BR_CNTR_OFFSET 32
+#define LBR_INFO_BR_CNTR_NUM 4
+#define LBR_INFO_BR_CNTR_BITS 2
+#define LBR_INFO_BR_CNTR_MASK GENMASK_ULL(LBR_INFO_BR_CNTR_BITS - 1, 0)
+#define LBR_INFO_BR_CNTR_FULL_MASK GENMASK_ULL(LBR_INFO_BR_CNTR_NUM * LBR_INFO_BR_CNTR_BITS - 1, 0)
#define MSR_ARCH_LBR_CTL 0x000014ce
#define ARCH_LBR_CTL_LBREN BIT(0)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 2618ec7..3736b8a 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -31,6 +31,7 @@
#define ARCH_PERFMON_EVENTSEL_ENABLE (1ULL << 22)
#define ARCH_PERFMON_EVENTSEL_INV (1ULL << 23)
#define ARCH_PERFMON_EVENTSEL_CMASK 0xFF000000ULL
+#define ARCH_PERFMON_EVENTSEL_BR_CNTR (1ULL << 35)
#define INTEL_FIXED_BITS_MASK 0xFULL
#define INTEL_FIXED_BITS_STRIDE 4
@@ -223,6 +224,9 @@ union cpuid28_ecx {
unsigned int lbr_timed_lbr:1;
/* Branch Type Field Supported */
unsigned int lbr_br_type:1;
+ unsigned int reserved:13;
+ /* Branch counters (Event Logging) Supported */
+ unsigned int lbr_counters:4;
} split;
unsigned int full;
};
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 4461f38..3a64499 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -1437,6 +1437,9 @@ struct perf_branch_entry {
reserved:31;
};
+/* Size of used info bits in struct perf_branch_entry */
+#define PERF_BRANCH_ENTRY_INFO_BITS_MAX 33
+
union perf_sample_weight {
__u64 full;
#if defined(__LITTLE_ENDIAN_BITFIELD)
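As an editorial aside (not part of the merged patch): given the
"branch_counter_nr" and "branch_counter_width" caps exposed above, a
minimal user-space sketch for decoding one u64 value from the branch
stack extension space could look as follows. The function name and
output format are assumptions for illustration only.

  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>

  /*
   * Decode one u64 "counters" value from the branch stack extension
   * space. "nr" and "width" are the values read from
   * /sys/bus/event_source/devices/cpu/caps/branch_counter_nr and
   * .../branch_counter_width. The fields are stored in the enabled
   * order of the logged events, lowest bits first.
   */
  static void decode_branch_counters(uint64_t counters, unsigned int nr,
				     unsigned int width)
  {
	  uint64_t mask = (1ULL << width) - 1;
	  unsigned int i;

	  for (i = 0; i < nr; i++)
		  printf("event %u: %" PRIu64 "\n", i,
			 (counters >> (i * width)) & mask);
  }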
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 571d91dcadfa3cef499010b4eddb9b58b0da4d24
Gitweb: https://git.kernel.org/tip/571d91dcadfa3cef499010b4eddb9b58b0da4d24
Author: Kan Liang <[email protected]>
AuthorDate: Wed, 25 Oct 2023 13:16:19 -07:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Fri, 27 Oct 2023 15:05:08 +02:00
perf: Add branch stack counters
Currently, the additional information of a branch entry is stored in a
u64 space. With more and more information added, the space is running
out. For example, the information of occurrences of events will be added
for each branch.
Two places were suggested to append the counters.
https://lore.kernel.org/lkml/[email protected]/
One place is right after the flags of each branch entry. It changes the
existing struct perf_branch_entry. The later ARCH-specific
implementation has to be really careful to consistently pick
the right struct.
The other place is right after the entire struct perf_branch_stack.
The disadvantage is that the pointer to the extra space has to be
recorded. The common interface perf_sample_save_brstack() has to be
updated.
The latter is much more straightforward, and should be easily understood
and maintained. It is implemented in this patch.
Add a new branch sample type, PERF_SAMPLE_BRANCH_COUNTERS, to indicate
the event which is recorded in the branch info.
The "u64 counters" may store the occurrences of several events. The
information regarding the number of events/counters and the width of
each counter should be exposed via sysfs as a reference for the perf
tool. Define the branch_counter_nr and branch_counter_width ABI here.
The support will be implemented later in the Intel-specific patch.
Suggested-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
Documentation/ABI/testing/sysfs-bus-event_source-devices-caps | 6 ++-
arch/powerpc/perf/core-book3s.c | 2 +-
arch/x86/events/amd/core.c | 2 +-
arch/x86/events/core.c | 2 +-
arch/x86/events/intel/core.c | 2 +-
arch/x86/events/intel/ds.c | 4 +-
include/linux/perf_event.h | 17 ++++++-
include/uapi/linux/perf_event.h | 10 ++++-
kernel/events/core.c | 8 +++-
9 files changed, 46 insertions(+), 7 deletions(-)
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-caps b/Documentation/ABI/testing/sysfs-bus-event_source-devices-caps
index 8757dcf..a5f506f 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-caps
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-caps
@@ -16,3 +16,9 @@ Description:
Example output in powerpc:
grep . /sys/bus/event_source/devices/cpu/caps/*
/sys/bus/event_source/devices/cpu/caps/pmu_name:POWER9
+
+ The "branch_counter_nr" in the supported platform exposes the
+ maximum number of counters which can be shown in the u64 counters
+ of PERF_SAMPLE_BRANCH_COUNTERS, while the "branch_counter_width"
+ exposes the width of each counter. Both of them can be used by
+ the perf tool to parse the logged counters in each branch.
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 8c1f7de..3c14596 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2313,7 +2313,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
struct cpu_hw_events *cpuhw;
cpuhw = this_cpu_ptr(&cpu_hw_events);
power_pmu_bhrb_read(event, cpuhw);
- perf_sample_save_brstack(&data, event, &cpuhw->bhrb_stack);
+ perf_sample_save_brstack(&data, event, &cpuhw->bhrb_stack, NULL);
}
if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index e249765..4ee6390 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -940,7 +940,7 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
continue;
if (has_branch_stack(event))
- perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);
+ perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 40ad142..40c9af1 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1702,7 +1702,7 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
perf_sample_data_init(&data, 0, event->hw.last_period);
if (has_branch_stack(event))
- perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);
+ perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index a08f794..41a1647 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3047,7 +3047,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
perf_sample_data_init(&data, 0, event->hw.last_period);
if (has_branch_stack(event))
- perf_sample_save_brstack(&data, event, &cpuc->lbr_stack);
+ perf_sample_save_brstack(&data, event, &cpuc->lbr_stack, NULL);
if (perf_event_overflow(event, &data, regs))
x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index bf97ab9..cb3f329 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1755,7 +1755,7 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
setup_pebs_time(event, data, pebs->tsc);
if (has_branch_stack(event))
- perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
}
static void adaptive_pebs_save_regs(struct pt_regs *regs,
@@ -1912,7 +1912,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
if (has_branch_stack(event)) {
intel_pmu_store_pebs_lbrs(lbr);
- perf_sample_save_brstack(data, event, &cpuc->lbr_stack);
+ perf_sample_save_brstack(data, event, &cpuc->lbr_stack, NULL);
}
}
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0367d74..7897ef0 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1139,6 +1139,10 @@ static inline bool branch_sample_priv(const struct perf_event *event)
return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_PRIV_SAVE;
}
+static inline bool branch_sample_counters(const struct perf_event *event)
+{
+ return event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS;
+}
struct perf_sample_data {
/*
@@ -1173,6 +1177,7 @@ struct perf_sample_data {
struct perf_callchain_entry *callchain;
struct perf_raw_record *raw;
struct perf_branch_stack *br_stack;
+ u64 *br_stack_cntr;
union perf_sample_weight weight;
union perf_mem_data_src data_src;
u64 txn;
@@ -1250,7 +1255,8 @@ static inline void perf_sample_save_raw_data(struct perf_sample_data *data,
static inline void perf_sample_save_brstack(struct perf_sample_data *data,
struct perf_event *event,
- struct perf_branch_stack *brs)
+ struct perf_branch_stack *brs,
+ u64 *brs_cntr)
{
int size = sizeof(u64); /* nr */
@@ -1258,7 +1264,16 @@ static inline void perf_sample_save_brstack(struct perf_sample_data *data,
size += sizeof(u64);
size += brs->nr * sizeof(struct perf_branch_entry);
+ /*
+ * The extension space for counters is appended after the
+ * struct perf_branch_stack. It is used to store the occurrences
+ * of events of each branch.
+ */
+ if (brs_cntr)
+ size += brs->nr * sizeof(u64);
+
data->br_stack = brs;
+ data->br_stack_cntr = brs_cntr;
data->dyn_size += size;
data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
}
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 39c6a25..4461f38 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -204,6 +204,8 @@ enum perf_branch_sample_type_shift {
PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT = 18, /* save privilege mode */
+ PERF_SAMPLE_BRANCH_COUNTERS_SHIFT = 19, /* save occurrences of events on a branch */
+
PERF_SAMPLE_BRANCH_MAX_SHIFT /* non-ABI */
};
@@ -235,6 +237,8 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_PRIV_SAVE = 1U << PERF_SAMPLE_BRANCH_PRIV_SAVE_SHIFT,
+ PERF_SAMPLE_BRANCH_COUNTERS = 1U << PERF_SAMPLE_BRANCH_COUNTERS_SHIFT,
+
PERF_SAMPLE_BRANCH_MAX = 1U << PERF_SAMPLE_BRANCH_MAX_SHIFT,
};
@@ -982,6 +986,12 @@ enum perf_event_type {
* { u64 nr;
* { u64 hw_idx; } && PERF_SAMPLE_BRANCH_HW_INDEX
* { u64 from, to, flags } lbr[nr];
+ * #
+ * # The format of the counters is decided by the
+ * # "branch_counter_nr" and "branch_counter_width",
+ * # which are defined in the ABI.
+ * #
+ * { u64 counters; } cntr[nr] && PERF_SAMPLE_BRANCH_COUNTERS
* } && PERF_SAMPLE_BRANCH_STACK
*
* { u64 abi; # enum perf_sample_regs_abi
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3eb26c2..d27ffd8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7341,6 +7341,14 @@ void perf_output_sample(struct perf_output_handle *handle,
if (branch_sample_hw_index(event))
perf_output_put(handle, data->br_stack->hw_idx);
perf_output_copy(handle, data->br_stack->entries, size);
+ /*
+ * Add the extension space which is appended
+ * right after the struct perf_branch_stack.
+ */
+ if (data->br_stack_cntr) {
+ size = data->br_stack->nr * sizeof(u64);
+ perf_output_copy(handle, data->br_stack_cntr, size);
+ }
} else {
/*
* we always store at least the value of nr
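As an editorial aside (not part of the merged patch): a minimal sketch
of how a consumer could walk the PERF_SAMPLE_BRANCH_STACK payload with
the counters appended, following the record layout documented in the
uapi comment above. The helper name is an assumption; error handling
and bounds checks are omitted.

  #include <stdint.h>

  /*
   * "p" points at the u64 "nr" word of the branch stack payload inside
   * a PERF_RECORD_SAMPLE. Returns a pointer to the next sample field.
   */
  static const uint64_t *skip_brstack(const uint64_t *p, int has_hw_idx,
				      int has_counters)
  {
	  uint64_t nr = *p++;		/* u64 nr */

	  if (has_hw_idx)
		  p++;			/* u64 hw_idx (PERF_SAMPLE_BRANCH_HW_INDEX) */
	  p += nr * 3;			/* { u64 from, to, flags } lbr[nr] */
	  if (has_counters)
		  p += nr;		/* { u64 counters; } cntr[nr] */
	  return p;
  }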
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 318c4985911245508f7e0bab5265e208a38b5f18
Gitweb: https://git.kernel.org/tip/318c4985911245508f7e0bab5265e208a38b5f18
Author: Kan Liang <[email protected]>
AuthorDate: Wed, 25 Oct 2023 13:16:22 -07:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Fri, 27 Oct 2023 15:05:10 +02:00
perf/x86/intel: Reorganize attrs and is_visible
Some attrs and is_visible implementations are rather far away from one
another, which makes the whole thing hard to interpret.
There are only two attribute groups which have both .attrs and
.is_visible: group_default and group_caps_lbr. Move them together.
No functional changes.
Suggested-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/events/intel/core.c | 30 +++++++++++++++---------------
1 file changed, 15 insertions(+), 15 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index a99449c..584b58d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5540,6 +5540,12 @@ static struct attribute *lbr_attrs[] = {
NULL
};
+static umode_t
+lbr_is_visible(struct kobject *kobj, struct attribute *attr, int i)
+{
+ return x86_pmu.lbr_nr ? attr->mode : 0;
+}
+
static char pmu_name_str[30];
static ssize_t pmu_name_show(struct device *cdev,
@@ -5567,6 +5573,15 @@ static struct attribute *intel_pmu_attrs[] = {
};
static umode_t
+default_is_visible(struct kobject *kobj, struct attribute *attr, int i)
+{
+ if (attr == &dev_attr_allow_tsx_force_abort.attr)
+ return x86_pmu.flags & PMU_FL_TFA ? attr->mode : 0;
+
+ return attr->mode;
+}
+
+static umode_t
tsx_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
return boot_cpu_has(X86_FEATURE_RTM) ? attr->mode : 0;
@@ -5588,26 +5603,11 @@ mem_is_visible(struct kobject *kobj, struct attribute *attr, int i)
}
static umode_t
-lbr_is_visible(struct kobject *kobj, struct attribute *attr, int i)
-{
- return x86_pmu.lbr_nr ? attr->mode : 0;
-}
-
-static umode_t
exra_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
return x86_pmu.version >= 2 ? attr->mode : 0;
}
-static umode_t
-default_is_visible(struct kobject *kobj, struct attribute *attr, int i)
-{
- if (attr == &dev_attr_allow_tsx_force_abort.attr)
- return x86_pmu.flags & PMU_FL_TFA ? attr->mode : 0;
-
- return attr->mode;
-}
-
static struct attribute_group group_events_td = {
.name = "events",
};
On Wed, Oct 25, 2023 at 01:16:20PM -0700, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> Currently, branch_sample_type != 0 is used to check whether a branch
> stack setup is required. But it doesn't check the sample type, so an
> unnecessary branch stack setup may be done for a counting event. E.g.,
> perf record -e "{branch-instructions,branch-misses}:S" -j any
> Also, the event only with the new PERF_SAMPLE_BRANCH_COUNTERS branch
> sample type may not require a branch stack setup either.
>
> Add a new flag NEEDS_BRANCH_STACK to indicate whether the event requires
> a branch stack setup. Replace the needs_branch_stack() by checking the
> new flag.
>
> The counting event check is implemented here. A later patch will take
> the new PERF_SAMPLE_BRANCH_COUNTERS into account.
>
> Signed-off-by: Kan Liang <[email protected]>
> ---
>
> No changes since V4
So I saw this on tip/perf/urgent; I'm picking up the tools bits then.
- Arnaldo
> arch/x86/events/intel/core.c | 14 +++++++++++---
> arch/x86/events/perf_event_flags.h | 1 +
> 2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index 41a164764a84..a99449c0d77c 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -2527,9 +2527,14 @@ static void intel_pmu_assign_event(struct perf_event *event, int idx)
> perf_report_aux_output_id(event, idx);
> }
>
> +static __always_inline bool intel_pmu_needs_branch_stack(struct perf_event *event)
> +{
> + return event->hw.flags & PERF_X86_EVENT_NEEDS_BRANCH_STACK;
> +}
> +
> static void intel_pmu_del_event(struct perf_event *event)
> {
> - if (needs_branch_stack(event))
> + if (intel_pmu_needs_branch_stack(event))
> intel_pmu_lbr_del(event);
> if (event->attr.precise_ip)
> intel_pmu_pebs_del(event);
> @@ -2820,7 +2825,7 @@ static void intel_pmu_add_event(struct perf_event *event)
> {
> if (event->attr.precise_ip)
> intel_pmu_pebs_add(event);
> - if (needs_branch_stack(event))
> + if (intel_pmu_needs_branch_stack(event))
> intel_pmu_lbr_add(event);
> }
>
> @@ -3897,7 +3902,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
> x86_pmu.pebs_aliases(event);
> }
>
> - if (needs_branch_stack(event)) {
> + if (needs_branch_stack(event) && is_sampling_event(event))
> + event->hw.flags |= PERF_X86_EVENT_NEEDS_BRANCH_STACK;
> +
> + if (intel_pmu_needs_branch_stack(event)) {
> ret = intel_pmu_setup_lbr_filter(event);
> if (ret)
> return ret;
> diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
> index 1dc19b9b4426..a1685981c520 100644
> --- a/arch/x86/events/perf_event_flags.h
> +++ b/arch/x86/events/perf_event_flags.h
> @@ -20,3 +20,4 @@ PERF_ARCH(TOPDOWN, 0x04000) /* Count Topdown slots/metrics events */
> PERF_ARCH(PEBS_STLAT, 0x08000) /* st+stlat data address sampling */
> PERF_ARCH(AMD_BRS, 0x10000) /* AMD Branch Sampling */
> PERF_ARCH(PEBS_LAT_HYBRID, 0x20000) /* ld and st lat for hybrid */
> +PERF_ARCH(NEEDS_BRANCH_STACK, 0x40000) /* require branch stack setup */
> --
> 2.35.1
>
--
- Arnaldo
On 2023-11-06 4:12 p.m., Arnaldo Carvalho de Melo wrote:
On Wed, Oct 25, 2023 at 01:16:20PM -0700, [email protected] wrote:
>> From: Kan Liang <[email protected]>
>>
>> Currently, branch_sample_type != 0 is used to check whether a branch
>> stack setup is required. But it doesn't check the sample type, so an
>> unnecessary branch stack setup may be done for a counting event. E.g.,
>> perf record -e "{branch-instructions,branch-misses}:S" -j any
>> Also, the event only with the new PERF_SAMPLE_BRANCH_COUNTERS branch
>> sample type may not require a branch stack setup either.
>>
>> Add a new flag NEEDS_BRANCH_STACK to indicate whether the event requires
>> a branch stack setup. Replace the needs_branch_stack() by checking the
>> new flag.
>>
>> The counting event check is implemented here. A later patch will take
>> the new PERF_SAMPLE_BRANCH_COUNTERS into account.
>>
>> Signed-off-by: Kan Liang <[email protected]>
>> ---
>>
>> No changes since V4
>
> So I saw this on tip/perf/urgent, I'm picking the tools bits then.
Thanks Arnaldo.
Ian has already reviewed the tool parts.
But I still owe a test case for the feature. I will post a patch later.
https://lore.kernel.org/lkml/[email protected]/
Thanks,
Kan
>
> - Arnaldo
>
>> arch/x86/events/intel/core.c | 14 +++++++++++---
>> arch/x86/events/perf_event_flags.h | 1 +
>> 2 files changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index 41a164764a84..a99449c0d77c 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -2527,9 +2527,14 @@ static void intel_pmu_assign_event(struct perf_event *event, int idx)
>> perf_report_aux_output_id(event, idx);
>> }
>>
>> +static __always_inline bool intel_pmu_needs_branch_stack(struct perf_event *event)
>> +{
>> + return event->hw.flags & PERF_X86_EVENT_NEEDS_BRANCH_STACK;
>> +}
>> +
>> static void intel_pmu_del_event(struct perf_event *event)
>> {
>> - if (needs_branch_stack(event))
>> + if (intel_pmu_needs_branch_stack(event))
>> intel_pmu_lbr_del(event);
>> if (event->attr.precise_ip)
>> intel_pmu_pebs_del(event);
>> @@ -2820,7 +2825,7 @@ static void intel_pmu_add_event(struct perf_event *event)
>> {
>> if (event->attr.precise_ip)
>> intel_pmu_pebs_add(event);
>> - if (needs_branch_stack(event))
>> + if (intel_pmu_needs_branch_stack(event))
>> intel_pmu_lbr_add(event);
>> }
>>
>> @@ -3897,7 +3902,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
>> x86_pmu.pebs_aliases(event);
>> }
>>
>> - if (needs_branch_stack(event)) {
>> + if (needs_branch_stack(event) && is_sampling_event(event))
>> + event->hw.flags |= PERF_X86_EVENT_NEEDS_BRANCH_STACK;
>> +
>> + if (intel_pmu_needs_branch_stack(event)) {
>> ret = intel_pmu_setup_lbr_filter(event);
>> if (ret)
>> return ret;
>> diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
>> index 1dc19b9b4426..a1685981c520 100644
>> --- a/arch/x86/events/perf_event_flags.h
>> +++ b/arch/x86/events/perf_event_flags.h
>> @@ -20,3 +20,4 @@ PERF_ARCH(TOPDOWN, 0x04000) /* Count Topdown slots/metrics events */
>> PERF_ARCH(PEBS_STLAT, 0x08000) /* st+stlat data address sampling */
>> PERF_ARCH(AMD_BRS, 0x10000) /* AMD Branch Sampling */
>> PERF_ARCH(PEBS_LAT_HYBRID, 0x20000) /* ld and st lat for hybrid */
>> +PERF_ARCH(NEEDS_BRANCH_STACK, 0x40000) /* require branch stack setup */
>> --
>> 2.35.1
>>
>