2019-10-25 14:01:13

by Tan Xiaojun

[permalink] [raw]
Subject: [RFC v2 4/4] perf tools: Support "branch-misses:pp" on arm64

At the suggestion of James Clark, use spe to support the precise
ip of some events. Currently its support event is:
branch-misses.

Example usage:

$ ./perf record -e branch-misses:pp dd if=/dev/zero of=/dev/null count=10000
(:p/pp/ppp is same for this case.)

$ ./perf report --stdio
("--stdio is not necessary")

--------------------------------------------------------------------
...
# Samples: 14 of event 'branch-misses:pp'
# Event count (approx.): 14
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ..........................
#
14.29% 14.29% dd [kernel.kallsyms] [k] __arch_copy_from_user
14.29% 14.29% dd libc-2.28.so [.] _dl_addr
7.14% 7.14% dd [kernel.kallsyms] [k] __free_pages
7.14% 7.14% dd [kernel.kallsyms] [k] __pi_memcpy
7.14% 7.14% dd [kernel.kallsyms] [k] pagecache_get_page
7.14% 7.14% dd [kernel.kallsyms] [k] unmap_single_vma
7.14% 7.14% dd dd [.] 0x00000000000025ec
7.14% 7.14% dd ld-2.28.so [.] _dl_lookup_symbol_x
7.14% 7.14% dd ld-2.28.so [.] check_match
7.14% 7.14% dd libc-2.28.so [.] __mpn_rshift
7.14% 7.14% dd libc-2.28.so [.] _nl_intern_locale_data
7.14% 7.14% dd libc-2.28.so [.] read_alias_file
...
--------------------------------------------------------------------

Signed-off-by: Tan Xiaojun <[email protected]>
---
tools/perf/util/arm-spe.c | 44 +++++++++++++++++++++++++++++++++++++++
tools/perf/util/arm-spe.h | 3 +++
tools/perf/util/evlist.c | 2 ++
3 files changed, 49 insertions(+)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 596a48df6f4e..9851d1ed6d75 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -35,6 +35,19 @@

#define MAX_TIMESTAMP (~0ULL)

+#define SPE_ATTR_TS_ENABLE BIT(0)
+#define SPE_ATTR_PA_ENABLE BIT(1)
+#define SPE_ATTR_PCT_ENABLE BIT(2)
+#define SPE_ATTR_JITTER BIT(16)
+#define SPE_ATTR_BRANCH_FILTER BIT(32)
+#define SPE_ATTR_LOAD_FILTER BIT(33)
+#define SPE_ATTR_STORE_FILTER BIT(34)
+
+#define SPE_ATTR_EV_RETIRED BIT(1)
+#define SPE_ATTR_EV_CACHE BIT(3)
+#define SPE_ATTR_EV_TLB BIT(5)
+#define SPE_ATTR_EV_BRANCH BIT(7)
+
struct arm_spe {
struct auxtrace auxtrace;
struct auxtrace_queues queues;
@@ -771,6 +784,15 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
attr.sample_id_all = evsel->core.attr.sample_id_all;
attr.read_format = evsel->core.attr.read_format;

+ /* If it is in the precise ip mode, there is no need to
+ * synthesize new events. */
+ if (!strncmp(evsel->name, "branch-misses", 13)) {
+ spe->sample_branch_miss = true;
+ spe->branch_miss_id = evsel->core.id[0];
+
+ return 0;
+ }
+
/* create new id val to be a fixed offset from evsel id */
id = evsel->core.id[0] + 1000000000;

@@ -880,3 +902,25 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
free(spe);
return err;
}
+
+void arm_spe_precise_ip_support(struct evlist *evlist, struct evsel *evsel)
+{
+ struct perf_pmu *pmu;
+
+ /* Currently only supports precise_ip for branch-misses on arm64 */
+ if (!strcmp(perf_env__arch(evlist->env), "arm64")
+ && evsel->core.attr.config == PERF_COUNT_HW_BRANCH_MISSES
+ && evsel->core.attr.precise_ip) {
+ pmu = perf_pmu__find("arm_spe_0");
+ if (pmu) {
+ evsel->pmu_name = pmu->name;
+ evsel->core.attr.type = PERF_RECORD_AUXTRACE;
+ evsel->core.attr.config = SPE_ATTR_TS_ENABLE
+ | SPE_ATTR_PA_ENABLE
+ | SPE_ATTR_JITTER
+ | SPE_ATTR_BRANCH_FILTER;
+ evsel->core.attr.precise_ip = 0;
+ evsel->core.attr.config1 = SPE_ATTR_EV_BRANCH;
+ }
+ }
+}
diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h
index 98d3235781c3..8b1fb191d03a 100644
--- a/tools/perf/util/arm-spe.h
+++ b/tools/perf/util/arm-spe.h
@@ -20,6 +20,8 @@ enum {
union perf_event;
struct perf_session;
struct perf_pmu;
+struct evlist;
+struct evsel;

struct auxtrace_record *arm_spe_recording_init(int *err,
struct perf_pmu *arm_spe_pmu);
@@ -28,4 +30,5 @@ int arm_spe_process_auxtrace_info(union perf_event *event,
struct perf_session *session);

struct perf_event_attr *arm_spe_pmu_default_config(struct perf_pmu *arm_spe_pmu);
+void arm_spe_precise_ip_support(struct evlist *evlist, struct evsel *evsel);
#endif
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d277a98e62df..8a83d2b98209 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -9,6 +9,7 @@
#include <errno.h>
#include <inttypes.h>
#include <poll.h>
+#include "arm-spe.h"
#include "cpumap.h"
#include "util/mmap.h"
#include "thread_map.h"
@@ -181,6 +182,7 @@ void perf_evlist__splice_list_tail(struct evlist *evlist,
struct evsel *evsel, *temp;

__evlist__for_each_entry_safe(list, temp, evsel) {
+ arm_spe_precise_ip_support(evlist, evsel);
list_del_init(&evsel->core.node);
evlist__add(evlist, evsel);
}
--
2.17.1


2019-10-25 19:22:15

by Tan Xiaojun

[permalink] [raw]
Subject: Re: [RFC v2 4/4] perf tools: Support "branch-misses:pp" on arm64

On 2019/10/25 0:29, James Clark wrote:
> Hi Xiaojun,
>
> This looks really good. I tried this, but got an error:
>
>     ./perf record -e branch-misses:p ls
>     Error:
>     The branch-misses:p event is not supported.

Hi, James,

I can't reproduce this problem. If the current system doesn't support spe, it shouldn't report an error. I use the latest codes of the mainline:

commit f116b96685a046a89c25d4a6ba2da489145c8888 (mainline/master)
Merge: f632bfaa33ed 603d9299da32
Author: Linus Torvalds <[email protected]>
Date: Thu Oct 24 06:13:45 2019 -0400

Merge tag 'mfd-fixes-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd

I will go and see why this will be reported.

>
>
> I would have expected to use the event name that is listed in the SPE documentation for branch misses which is br_mis_pred or br_mis_pred_retired:
>
>     E[7], byte 0 bit [7]
>     Mispredicted. The defined values of this bit are:
>     0 Did not cause correction to the predicted program flow.
>     1 A branch that caused a correction to the predicted program flow.
>
>     If PMUv3 is implemented this Event is required to be implemented consistently with either BR_MIS_PRED or BR_MIS_PRED_RETIRED.
>

Do you mean that I can add these as new events to perf? If we think of them as new events, what should we do if the user does not add :pp for them?
(Or for these events, users can only add :pp to use them?)

>
> +       if (!strcmp(perf_env__arch(evlist->env), "arm64")
> +                       && evsel->core.attr.config == PERF_COUNT_HW_BRANCH_MISSES
> +                       && evsel->core.attr.precise_ip) {
>
> As I mentioned above PERF_COUNT_HW_BRANCH_MISSESdoesn't seem to match up with the actual event counter that is associated with this SPE event (BR_MIS_PRED). The fix for this is probably as simple as adding an OR for the other aliases for branch mispredicts.

What you mean is that we can filter with spe events(like BR_MIS_PRED) first, and if we have other events that are exactly the same(no more for now), then we can handle them by adding OR in the future?

>
> +               pmu = perf_pmu__find("arm_spe_0");
> +               if (pmu) {
> +                       evsel->pmu_name = pmu->name;
> +                       evsel->core.attr.type = PERF_RECORD_AUXTRACE;
> +                       evsel->core.attr.config = SPE_ATTR_TS_ENABLE
> +                                               | SPE_ATTR_PA_ENABLE
>
> I wouldn't set physical addresses by default as this requires root. I would leave that to the user if they want to manually configure SPE.

Yes. You are right. I got a error for this case. I will fix it.

------------------
./perf record -e branch-misses:p ls
Error:
You may not have permission to collect stats.
...
------------------

Thanks.
Xiaojun.

>
> I have only looked briefly and I will do some more testing.
>
>
> Thanks
> James
>
>


2019-11-13 14:50:27

by James Clark

[permalink] [raw]
Subject: Re: [RFC v2 4/4] perf tools: Support "branch-misses:pp" on arm64

Hi Xiaojun,

> I can't reproduce this problem. If the current system doesn't support spe, it shouldn't report an error. I use the latest codes of the mainline:

I think the problem is related to the 'type' attribute of the event. To open the SPE PMU the event type on the platform I'm using is '7'. If I change
the code like this, the problem is fixed:

@@ -914,13 +914,27 @@ void arm_spe_precise_ip_support(struct evlist *evlist, struct evsel *evsel)
pmu = perf_pmu__find("arm_spe_0");
if (pmu) {
evsel->pmu_name = pmu->name;
- evsel->core.attr.type = PERF_RECORD_AUXTRACE;
- evsel->core.attr.config = SPE_ATTR_TS_ENABLE
- | SPE_ATTR_PA_ENABLE
- | SPE_ATTR_JITTER
+ evsel->core.attr.type = pmu->type;
+ evsel->core.attr.config |= SPE_ATTR_TS_ENABLE
| SPE_ATTR_BRANCH_FILTER;

Also do you think jitter should be enabled by default? I thought that it might make the data less precise, so I removed it here.

-James

>
> commit f116b96685a046a89c25d4a6ba2da489145c8888 (mainline/master)
> Merge: f632bfaa33ed 603d9299da32
> Author: Linus Torvalds <[email protected]>
> Date: Thu Oct 24 06:13:45 2019 -0400
>
> Merge tag 'mfd-fixes-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
>
> I will go and see why this will be reported.
>
>>
>>
>> I would have expected to use the event name that is listed in the SPE documentation for branch misses which is br_mis_pred or br_mis_pred_retired:
>>
>>     E[7], byte 0 bit [7]
>>     Mispredicted. The defined values of this bit are:
>>     0 Did not cause correction to the predicted program flow.
>>     1 A branch that caused a correction to the predicted program flow.
>>
>>     If PMUv3 is implemented this Event is required to be implemented consistently with either BR_MIS_PRED or BR_MIS_PRED_RETIRED.
>>
>
> Do you mean that I can add these as new events to perf? If we think of them as new events, what should we do if the user does not add :pp for them?
> (Or for these events, users can only add :pp to use them?)
>
>>
>> +       if (!strcmp(perf_env__arch(evlist->env), "arm64")
>> +                       && evsel->core.attr.config == PERF_COUNT_HW_BRANCH_MISSES
>> +                       && evsel->core.attr.precise_ip) {
>>
>> As I mentioned above PERF_COUNT_HW_BRANCH_MISSESdoesn't seem to match up with the actual event counter that is associated with this SPE event (BR_MIS_PRED). The fix for this is probably as simple as adding an OR for the other aliases for branch mispredicts.
>
> What you mean is that we can filter with spe events(like BR_MIS_PRED) first, and if we have other events that are exactly the same(no more for now), then we can handle them by adding OR in the future?
>
>>
>> +               pmu = perf_pmu__find("arm_spe_0");
>> +               if (pmu) {
>> +                       evsel->pmu_name = pmu->name;
>> +                       evsel->core.attr.type = PERF_RECORD_AUXTRACE;
>> +                       evsel->core.attr.config = SPE_ATTR_TS_ENABLE
>> +                                               | SPE_ATTR_PA_ENABLE
>>
>> I wouldn't set physical addresses by default as this requires root. I would leave that to the user if they want to manually configure SPE.
>
> Yes. You are right. I got a error for this case. I will fix it.
>
> ------------------
> ./perf record -e branch-misses:p ls
> Error:
> You may not have permission to collect stats.
> ...
> ------------------
>
> Thanks.
> Xiaojun.
>
>>
>> I have only looked briefly and I will do some more testing.
>>
>>
>> Thanks
>> James
>>
>>
>
>

2019-11-15 03:05:08

by Tan Xiaojun

[permalink] [raw]
Subject: Re: [RFC v2 4/4] perf tools: Support "branch-misses:pp" on arm64

On 2019/11/13 22:47, James Clark wrote:
> Hi Xiaojun,
>
>> I can't reproduce this problem. If the current system doesn't support spe, it shouldn't report an error. I use the latest codes of the mainline:
>
> I think the problem is related to the 'type' attribute of the event. To open the SPE PMU the event type on the platform I'm using is '7'. If I change
> the code like this, the problem is fixed:
>
> @@ -914,13 +914,27 @@ void arm_spe_precise_ip_support(struct evlist *evlist, struct evsel *evsel)
> pmu = perf_pmu__find("arm_spe_0");
> if (pmu) {
> evsel->pmu_name = pmu->name;
> - evsel->core.attr.type = PERF_RECORD_AUXTRACE;
> - evsel->core.attr.config = SPE_ATTR_TS_ENABLE
> - | SPE_ATTR_PA_ENABLE
> - | SPE_ATTR_JITTER
> + evsel->core.attr.type = pmu->type;
> + evsel->core.attr.config |= SPE_ATTR_TS_ENABLE
> | SPE_ATTR_BRANCH_FILTER;
>

Hi, James,
OK. Thank you for your fix.

> Also do you think jitter should be enabled by default? I thought that it might make the data less precise, so I removed it here.

Since the interval for sampling without "jitter" is fixed (default 1024 on our server), I was worried that not adding it would result in the same result for each record, and some instructions could not be collected each time.

However, after many tests, it is not clear from the results that there is a significant difference between them (enable it or not).

So I am confused, whether to enable it or not.

Thanks.
Xiaojun.

>
> -James
>
>>
>> commit f116b96685a046a89c25d4a6ba2da489145c8888 (mainline/master)
>> Merge: f632bfaa33ed 603d9299da32
>> Author: Linus Torvalds <[email protected]>
>> Date: Thu Oct 24 06:13:45 2019 -0400
>>
>> Merge tag 'mfd-fixes-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
>>
>> I will go and see why this will be reported.
>>
>>>
>>>
>>> I would have expected to use the event name that is listed in the SPE documentation for branch misses which is br_mis_pred or br_mis_pred_retired:
>>>
>>>     E[7], byte 0 bit [7]
>>>     Mispredicted. The defined values of this bit are:
>>>     0 Did not cause correction to the predicted program flow.
>>>     1 A branch that caused a correction to the predicted program flow.
>>>
>>>     If PMUv3 is implemented this Event is required to be implemented consistently with either BR_MIS_PRED or BR_MIS_PRED_RETIRED.
>>>
>>
>> Do you mean that I can add these as new events to perf? If we think of them as new events, what should we do if the user does not add :pp for them?
>> (Or for these events, users can only add :pp to use them?)
>>
>>>
>>> +       if (!strcmp(perf_env__arch(evlist->env), "arm64")
>>> +                       && evsel->core.attr.config == PERF_COUNT_HW_BRANCH_MISSES
>>> +                       && evsel->core.attr.precise_ip) {
>>>
>>> As I mentioned above PERF_COUNT_HW_BRANCH_MISSESdoesn't seem to match up with the actual event counter that is associated with this SPE event (BR_MIS_PRED). The fix for this is probably as simple as adding an OR for the other aliases for branch mispredicts.
>>
>> What you mean is that we can filter with spe events(like BR_MIS_PRED) first, and if we have other events that are exactly the same(no more for now), then we can handle them by adding OR in the future?
>>
>>>
>>> +               pmu = perf_pmu__find("arm_spe_0");
>>> +               if (pmu) {
>>> +                       evsel->pmu_name = pmu->name;
>>> +                       evsel->core.attr.type = PERF_RECORD_AUXTRACE;
>>> +                       evsel->core.attr.config = SPE_ATTR_TS_ENABLE
>>> +                                               | SPE_ATTR_PA_ENABLE
>>>
>>> I wouldn't set physical addresses by default as this requires root. I would leave that to the user if they want to manually configure SPE.
>>
>> Yes. You are right. I got a error for this case. I will fix it.
>>
>> ------------------
>> ./perf record -e branch-misses:p ls
>> Error:
>> You may not have permission to collect stats.
>> ...
>> ------------------
>>
>> Thanks.
>> Xiaojun.
>>
>>>
>>> I have only looked briefly and I will do some more testing.
>>>
>>>
>>> Thanks
>>> James
>>>
>>>
>>
>>


2019-11-15 11:39:45

by James Clark

[permalink] [raw]
Subject: Re: [RFC v2 4/4] perf tools: Support "branch-misses:pp" on arm64

Hi Xiaojun,

If the difference is not noticeable I think it would be better to leave it disabled. Presumably if the user
supplies the ":p" argument they are interested in the data being as precise as possible.

If they want to enable jitter, then can always configure the SPE event manually.

I have a question about what kind of approach you think we should take for multiple events that are provided with :p.
For example "perf record -e branch-misses:p -e cache-misses:p ...". In your current implementation this will
give the error "There may be only one SPE event". I think this is fine for a first implementation. But I wonder if there
is a way of supporting multiple SPE events?

From the documentation it seems like the filter events are ANDed together:

PMSEVFR_EL1.
Controls sample filtering by events. The overall filter is the logical AND of these filters. For example, if E[3] and E[5] are both set to 1,
only samples that have both event 3 (Level 1 unified or data cache refill) and event 5 set (TLB walk) are recorded

Which means that if we kept adding filters for new event types, there would be no events received because they wouldn't satisfy the filter requirements
of being caused by a branch miss AND a cache miss for example. I have asked internally about whether this is a mistake or not.


Thanks
James

On 15/11/2019 02:59, Tan Xiaojun wrote:
> On 2019/11/13 22:47, James Clark wrote:
>> Hi Xiaojun,
>>
>>> I can't reproduce this problem. If the current system doesn't support spe, it shouldn't report an error. I use the latest codes of the mainline:
>>
>> I think the problem is related to the 'type' attribute of the event. To open the SPE PMU the event type on the platform I'm using is '7'. If I change
>> the code like this, the problem is fixed:
>>
>> @@ -914,13 +914,27 @@ void arm_spe_precise_ip_support(struct evlist *evlist, struct evsel *evsel)
>> pmu = perf_pmu__find("arm_spe_0");
>> if (pmu) {
>> evsel->pmu_name = pmu->name;
>> - evsel->core.attr.type = PERF_RECORD_AUXTRACE;
>> - evsel->core.attr.config = SPE_ATTR_TS_ENABLE
>> - | SPE_ATTR_PA_ENABLE
>> - | SPE_ATTR_JITTER
>> + evsel->core.attr.type = pmu->type;
>> + evsel->core.attr.config |= SPE_ATTR_TS_ENABLE
>> | SPE_ATTR_BRANCH_FILTER;
>>
>
> Hi, James,
> OK. Thank you for your fix.
>
>> Also do you think jitter should be enabled by default? I thought that it might make the data less precise, so I removed it here.
>
> Since the interval for sampling without "jitter" is fixed (default 1024 on our server), I was worried that not adding it would result in the same result for each record, and some instructions could not be collected each time.
>
> However, after many tests, it is not clear from the results that there is a significant difference between them (enable it or not).
>
> So I am confused, whether to enable it or not.
>
> Thanks.
> Xiaojun.
>
>>
>> -James
>>
>>>
>>> commit f116b96685a046a89c25d4a6ba2da489145c8888 (mainline/master)
>>> Merge: f632bfaa33ed 603d9299da32
>>> Author: Linus Torvalds <[email protected]>
>>> Date: Thu Oct 24 06:13:45 2019 -0400
>>>
>>> Merge tag 'mfd-fixes-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
>>>
>>> I will go and see why this will be reported.
>>>
>>>>
>>>>
>>>> I would have expected to use the event name that is listed in the SPE documentation for branch misses which is br_mis_pred or br_mis_pred_retired:
>>>>
>>>>     E[7], byte 0 bit [7]
>>>>     Mispredicted. The defined values of this bit are:
>>>>     0 Did not cause correction to the predicted program flow.
>>>>     1 A branch that caused a correction to the predicted program flow.
>>>>
>>>>     If PMUv3 is implemented this Event is required to be implemented consistently with either BR_MIS_PRED or BR_MIS_PRED_RETIRED.
>>>>
>>>
>>> Do you mean that I can add these as new events to perf? If we think of them as new events, what should we do if the user does not add :pp for them?
>>> (Or for these events, users can only add :pp to use them?)
>>>
>>>>
>>>> +       if (!strcmp(perf_env__arch(evlist->env), "arm64")
>>>> +                       && evsel->core.attr.config == PERF_COUNT_HW_BRANCH_MISSES
>>>> +                       && evsel->core.attr.precise_ip) {
>>>>
>>>> As I mentioned above PERF_COUNT_HW_BRANCH_MISSESdoesn't seem to match up with the actual event counter that is associated with this SPE event (BR_MIS_PRED). The fix for this is probably as simple as adding an OR for the other aliases for branch mispredicts.
>>>
>>> What you mean is that we can filter with spe events(like BR_MIS_PRED) first, and if we have other events that are exactly the same(no more for now), then we can handle them by adding OR in the future?
>>>
>>>>
>>>> +               pmu = perf_pmu__find("arm_spe_0");
>>>> +               if (pmu) {
>>>> +                       evsel->pmu_name = pmu->name;
>>>> +                       evsel->core.attr.type = PERF_RECORD_AUXTRACE;
>>>> +                       evsel->core.attr.config = SPE_ATTR_TS_ENABLE
>>>> +                                               | SPE_ATTR_PA_ENABLE
>>>>
>>>> I wouldn't set physical addresses by default as this requires root. I would leave that to the user if they want to manually configure SPE.
>>>
>>> Yes. You are right. I got a error for this case. I will fix it.
>>>
>>> ------------------
>>> ./perf record -e branch-misses:p ls
>>> Error:
>>> You may not have permission to collect stats.
>>> ...
>>> ------------------
>>>
>>> Thanks.
>>> Xiaojun.
>>>
>>>>
>>>> I have only looked briefly and I will do some more testing.
>>>>
>>>>
>>>> Thanks
>>>> James
>>>>
>>>>
>>>
>>>
>
>

2019-11-18 07:07:34

by Tan Xiaojun

[permalink] [raw]
Subject: Re: [RFC v2 4/4] perf tools: Support "branch-misses:pp" on arm64

On 2019/11/15 19:37, James Clark wrote:
> Hi Xiaojun,
>
> If the difference is not noticeable I think it would be better to leave it disabled. Presumably if the user
> supplies the ":p" argument they are interested in the data being as precise as possible.
>
> If they want to enable jitter, then can always configure the SPE event manually.
>

Hi,++ James,

OK. Agree.

> I have a question about what kind of approach you think we should take for multiple events that are provided with :p.
> For example "perf record -e branch-misses:p -e cache-misses:p ...". In your current implementation this will
> give the error "There may be only one SPE event". I think this is fine for a first implementation. But I wonder if there
> is a way of supporting multiple SPE events?
>
> From the documentation it seems like the filter events are ANDed together:
>
> PMSEVFR_EL1.
> Controls sample filtering by events. The overall filter is the logical AND of these filters. For example, if E[3] and E[5] are both set to 1,
> only samples that have both event 3 (Level 1 unified or data cache refill) and event 5 set (TLB walk) are recorded
>
> Which means that if we kept adding filters for new event types, there would be no events received because they wouldn't satisfy the filter requirements
> of being caused by a branch miss AND a cache miss for example. I have asked internally about whether this is a mistake or not.
>

Yes, this is a problem, and you mentioned that we have to define several new spe events for this, and I am still considering how to add them.(I originally wanted to add them to the spe driver, but this will not avoid the problem of multiple spe events you mentioned above.)

Based on scenarios where no new spe events are added, if the user specifies multiple spe events, I think we need to analyze these events and classify the spe events. If there are multiple, we need to keep only one (but record all spe events name or alias like "branch-misses, cache-misses" in some variables) and disable event_filter. Then "perf report" can create new selectors for these synthetic events and output them.

Thanks.
Xiaojun.

>
> Thanks
> James
>
> On 15/11/2019 02:59, Tan Xiaojun wrote:
>> On 2019/11/13 22:47, James Clark wrote:
>>> Hi Xiaojun,
>>>
>>>> I can't reproduce this problem. If the current system doesn't support spe, it shouldn't report an error. I use the latest codes of the mainline:
>>>
>>> I think the problem is related to the 'type' attribute of the event. To open the SPE PMU the event type on the platform I'm using is '7'. If I change
>>> the code like this, the problem is fixed:
>>>
>>> @@ -914,13 +914,27 @@ void arm_spe_precise_ip_support(struct evlist *evlist, struct evsel *evsel)
>>> pmu = perf_pmu__find("arm_spe_0");
>>> if (pmu) {
>>> evsel->pmu_name = pmu->name;
>>> - evsel->core.attr.type = PERF_RECORD_AUXTRACE;
>>> - evsel->core.attr.config = SPE_ATTR_TS_ENABLE
>>> - | SPE_ATTR_PA_ENABLE
>>> - | SPE_ATTR_JITTER
>>> + evsel->core.attr.type = pmu->type;
>>> + evsel->core.attr.config |= SPE_ATTR_TS_ENABLE
>>> | SPE_ATTR_BRANCH_FILTER;
>>>
>>
>> Hi, James,
>> OK. Thank you for your fix.
>>
>>> Also do you think jitter should be enabled by default? I thought that it might make the data less precise, so I removed it here.
>>
>> Since the interval for sampling without "jitter" is fixed (default 1024 on our server), I was worried that not adding it would result in the same result for each record, and some instructions could not be collected each time.
>>
>> However, after many tests, it is not clear from the results that there is a significant difference between them (enable it or not).
>>
>> So I am confused, whether to enable it or not.
>>
>> Thanks.
>> Xiaojun.
>>
>>>
>>> -James
>>>
>>>>
>>>> commit f116b96685a046a89c25d4a6ba2da489145c8888 (mainline/master)
>>>> Merge: f632bfaa33ed 603d9299da32
>>>> Author: Linus Torvalds <[email protected]>
>>>> Date: Thu Oct 24 06:13:45 2019 -0400
>>>>
>>>> Merge tag 'mfd-fixes-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
>>>>
>>>> I will go and see why this will be reported.
>>>>
>>>>>
>>>>>
>>>>> I would have expected to use the event name that is listed in the SPE documentation for branch misses which is br_mis_pred or br_mis_pred_retired:
>>>>>
>>>>>     E[7], byte 0 bit [7]
>>>>>     Mispredicted. The defined values of this bit are:
>>>>>     0 Did not cause correction to the predicted program flow.
>>>>>     1 A branch that caused a correction to the predicted program flow.
>>>>>
>>>>>     If PMUv3 is implemented this Event is required to be implemented consistently with either BR_MIS_PRED or BR_MIS_PRED_RETIRED.
>>>>>
>>>>
>>>> Do you mean that I can add these as new events to perf? If we think of them as new events, what should we do if the user does not add :pp for them?
>>>> (Or for these events, users can only add :pp to use them?)
>>>>
>>>>>
>>>>> +       if (!strcmp(perf_env__arch(evlist->env), "arm64")
>>>>> +                       && evsel->core.attr.config == PERF_COUNT_HW_BRANCH_MISSES
>>>>> +                       && evsel->core.attr.precise_ip) {
>>>>>
>>>>> As I mentioned above PERF_COUNT_HW_BRANCH_MISSESdoesn't seem to match up with the actual event counter that is associated with this SPE event (BR_MIS_PRED). The fix for this is probably as simple as adding an OR for the other aliases for branch mispredicts.
>>>>
>>>> What you mean is that we can filter with spe events(like BR_MIS_PRED) first, and if we have other events that are exactly the same(no more for now), then we can handle them by adding OR in the future?
>>>>
>>>>>
>>>>> +               pmu = perf_pmu__find("arm_spe_0");
>>>>> +               if (pmu) {
>>>>> +                       evsel->pmu_name = pmu->name;
>>>>> +                       evsel->core.attr.type = PERF_RECORD_AUXTRACE;
>>>>> +                       evsel->core.attr.config = SPE_ATTR_TS_ENABLE
>>>>> +                                               | SPE_ATTR_PA_ENABLE
>>>>>
>>>>> I wouldn't set physical addresses by default as this requires root. I would leave that to the user if they want to manually configure SPE.
>>>>
>>>> Yes. You are right. I got a error for this case. I will fix it.
>>>>
>>>> ------------------
>>>> ./perf record -e branch-misses:p ls
>>>> Error:
>>>> You may not have permission to collect stats.
>>>> ...
>>>> ------------------
>>>>
>>>> Thanks.
>>>> Xiaojun.
>>>>
>>>>>
>>>>> I have only looked briefly and I will do some more testing.
>>>>>
>>>>>
>>>>> Thanks
>>>>> James
>>>>>
>>>>>
>>>>
>>>>
>>
>>