2023-06-15 05:19:04

by Ravi Bangoria

[permalink] [raw]
Subject: [PATCH v2 0/3] perf mem amd: Fix few logic bugs

Recent PMU refactoring changes[1] introduced a notion of core vs other
PMUs and made perf mem/c2c code depend only on core PMUs, which is
logically wrong for AMD as perf mem/c2c on AMD depends on IBS OP PMU,
not the core PMU. Although user visible perf mem/c2c functionality is
still working fine, internal code logic is wrong. Fix those.

[1] https://lore.kernel.org/r/[email protected]

v1: https://lore.kernel.org/r/[email protected]
v1->v2:
- Patch #2 of last version is already picked up by Arnaldo. So skip it.
- Scan all PMUs unconditionally in perf mem code instead of making it
conditional on arch.

Ravi Bangoria (3):
perf pmus: Describe semantics of 'core_pmus' and 'other_pmus'
perf mem amd: Fix perf_pmus__num_mem_pmus()
perf mem: Scan all PMUs instead of just core ones

tools/perf/arch/x86/util/pmu.c | 12 ++++++++++++
tools/perf/util/mem-events.c | 13 +++++++++----
tools/perf/util/pmus.c | 17 ++++++++++++++++-
3 files changed, 37 insertions(+), 5 deletions(-)

--
2.40.1



2023-06-15 05:45:26

by Ravi Bangoria

[permalink] [raw]
Subject: [PATCH v2 1/3] perf pmus: Describe semantics of 'core_pmus' and 'other_pmus'

Notion of 'core_pmus' and 'other_pmus' are independent of hw core and
uncore pmus. For example, AMD IBS PMUs are present in each SMT-thread
but they belongs to 'other_pmus'. Add a comment describing what these
list contains and how they are treated.

Signed-off-by: Ravi Bangoria <[email protected]>
---
tools/perf/util/pmus.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index e1d0a93147e5..8c50ab8894b7 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -12,6 +12,21 @@
#include "pmu.h"
#include "print-events.h"

+/*
+ * core_pmus: A PMU belongs to core_pmus if it's name is "cpu" or it's sysfs
+ * directory contains "cpus" file. All PMUs belonging to core_pmus
+ * must have pmu->is_core=1. If there are more than one PMU in
+ * this list, perf interprets it as a heterogeneous platform.
+ * (FWIW, certain ARM platforms having heterogeneous cores uses
+ * homogeneous PMU, and thus they are treated as homogeneous
+ * platform by perf because core_pmus will have only one entry)
+ * other_pmus: All other PMUs which are not part of core_pmus list. It doesn't
+ * matter whether PMU is present per SMT-thread or outside of the
+ * core in the hw. For e.g., an instance of AMD ibs_fetch// and
+ * ibs_op// PMUs is present in each hw SMT thread, however they
+ * are captured under other_pmus. PMUs belonging to other_pmus
+ * must have pmu->is_core=0 but pmu->is_uncore could be 0 or 1.
+ */
static LIST_HEAD(core_pmus);
static LIST_HEAD(other_pmus);
static bool read_sysfs_core_pmus;
--
2.40.1


2023-06-15 05:46:17

by Ravi Bangoria

[permalink] [raw]
Subject: [PATCH v2 3/3] perf mem: Scan all PMUs instead of just core ones

Scanning only core PMUs is not sufficient on platforms like AMD since
perf mem on AMD uses IBS OP PMU, which is independent of core PMU.
Scan all PMUs instead of just core PMUs. There should be negligible
performance overhead because of scanning all PMUs, so we should be okay.

Signed-off-by: Ravi Bangoria <[email protected]>
---
tools/perf/util/mem-events.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index be15aadb6b14..c07fe3a90722 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -130,7 +130,12 @@ int perf_mem_events__init(void)
if (!e->tag)
continue;

- while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
+ /*
+ * Scan all PMUs not just core ones, since perf mem/c2c on
+ * platforms like AMD uses IBS OP PMU which is independent
+ * of core PMU.
+ */
+ while ((pmu = perf_pmus__scan(pmu)) != NULL) {
scnprintf(sysfs_name, sizeof(sysfs_name), e->sysfs_name, pmu->name);
e->supported |= perf_mem_event__supported(mnt, sysfs_name);
}
@@ -165,7 +170,7 @@ static void perf_mem_events__print_unsupport_hybrid(struct perf_mem_event *e,
char sysfs_name[100];
struct perf_pmu *pmu = NULL;

- while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
+ while ((pmu = perf_pmus__scan(pmu)) != NULL) {
scnprintf(sysfs_name, sizeof(sysfs_name), e->sysfs_name,
pmu->name);
if (!perf_mem_event__supported(mnt, sysfs_name)) {
@@ -188,7 +193,7 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
if (!e->record)
continue;

- if (perf_pmus__num_core_pmus() == 1) {
+ if (perf_pmus__num_mem_pmus() == 1) {
if (!e->supported) {
pr_err("failed: event '%s' not supported\n",
perf_mem_events__name(j, NULL));
@@ -203,7 +208,7 @@ int perf_mem_events__record_args(const char **rec_argv, int *argv_nr,
return -1;
}

- while ((pmu = perf_pmus__scan_core(pmu)) != NULL) {
+ while ((pmu = perf_pmus__scan(pmu)) != NULL) {
rec_argv[i++] = "-e";
s = perf_mem_events__name(j, pmu->name);
if (s) {
--
2.40.1


2023-06-15 05:59:38

by Ravi Bangoria

[permalink] [raw]
Subject: [PATCH v2 2/3] perf mem amd: Fix perf_pmus__num_mem_pmus()

perf mem/c2c on AMD internally uses IBS OP PMU, not the core PMU. Also,
AMD platforms does not have heterogeneous PMUs.

Signed-off-by: Ravi Bangoria <[email protected]>
---
tools/perf/arch/x86/util/pmu.c | 12 ++++++++++++
tools/perf/util/pmus.c | 2 +-
2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
index 3c0de3370d7e..f3534a791e79 100644
--- a/tools/perf/arch/x86/util/pmu.c
+++ b/tools/perf/arch/x86/util/pmu.c
@@ -14,6 +14,8 @@
#include "../../../util/intel-bts.h"
#include "../../../util/pmu.h"
#include "../../../util/fncache.h"
+#include "../../../util/pmus.h"
+#include "env.h"

struct pmu_alias {
char *name;
@@ -168,3 +170,13 @@ char *pmu_find_alias_name(const char *name)

return __pmu_find_alias_name(name);
}
+
+int perf_pmus__num_mem_pmus(void)
+{
+ /* AMD uses IBS OP pmu for perf mem/c2c */
+ if (x86__is_amd_cpu())
+ return 1;
+
+ /* Intel uses core pmus for perf mem/c2c */
+ return perf_pmus__num_core_pmus();
+}
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index 8c50ab8894b7..a2032c1b7644 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -242,7 +242,7 @@ const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str)
return NULL;
}

-int perf_pmus__num_mem_pmus(void)
+int __weak perf_pmus__num_mem_pmus(void)
{
/* All core PMUs are for mem events. */
return perf_pmus__num_core_pmus();
--
2.40.1


2023-06-15 06:33:29

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] perf mem amd: Fix perf_pmus__num_mem_pmus()

On Wed, Jun 14, 2023 at 10:18 PM Ravi Bangoria <[email protected]> wrote:
>
> perf mem/c2c on AMD internally uses IBS OP PMU, not the core PMU. Also,
> AMD platforms does not have heterogeneous PMUs.
>
> Signed-off-by: Ravi Bangoria <[email protected]>
> ---
> tools/perf/arch/x86/util/pmu.c | 12 ++++++++++++
> tools/perf/util/pmus.c | 2 +-
> 2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
> index 3c0de3370d7e..f3534a791e79 100644
> --- a/tools/perf/arch/x86/util/pmu.c
> +++ b/tools/perf/arch/x86/util/pmu.c
> @@ -14,6 +14,8 @@
> #include "../../../util/intel-bts.h"
> #include "../../../util/pmu.h"
> #include "../../../util/fncache.h"
> +#include "../../../util/pmus.h"
> +#include "env.h"
>
> struct pmu_alias {
> char *name;
> @@ -168,3 +170,13 @@ char *pmu_find_alias_name(const char *name)
>
> return __pmu_find_alias_name(name);
> }
> +
> +int perf_pmus__num_mem_pmus(void)
> +{
> + /* AMD uses IBS OP pmu for perf mem/c2c */

nit: could we just add:
AMD uses IBS OP pmu and not a core PMU for perf mem/c2c

Thanks,
Ian

> + if (x86__is_amd_cpu())
> + return 1;
> +
> + /* Intel uses core pmus for perf mem/c2c */
> + return perf_pmus__num_core_pmus();
> +}
> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index 8c50ab8894b7..a2032c1b7644 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -242,7 +242,7 @@ const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str)
> return NULL;
> }
>
> -int perf_pmus__num_mem_pmus(void)
> +int __weak perf_pmus__num_mem_pmus(void)
> {
> /* All core PMUs are for mem events. */
> return perf_pmus__num_core_pmus();
> --
> 2.40.1
>

2023-06-15 06:39:01

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] perf mem amd: Fix perf_pmus__num_mem_pmus()

>> +int perf_pmus__num_mem_pmus(void)
>> +{
>> + /* AMD uses IBS OP pmu for perf mem/c2c */
>
> nit: could we just add:
> AMD uses IBS OP pmu and not a core PMU for perf mem/c2c

Sure.

2023-06-15 06:39:49

by Ravi Bangoria

[permalink] [raw]
Subject: [PATCH v2.1 2/3] perf mem amd: Fix perf_pmus__num_mem_pmus()

perf mem/c2c on AMD internally uses IBS OP PMU, not the core PMU. Also,
AMD platforms does not have heterogeneous PMUs.

Signed-off-by: Ravi Bangoria <[email protected]>
---
tools/perf/arch/x86/util/pmu.c | 12 ++++++++++++
tools/perf/util/pmus.c | 2 +-
2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
index 3c0de3370d7e..65d8cdff4d5f 100644
--- a/tools/perf/arch/x86/util/pmu.c
+++ b/tools/perf/arch/x86/util/pmu.c
@@ -14,6 +14,8 @@
#include "../../../util/intel-bts.h"
#include "../../../util/pmu.h"
#include "../../../util/fncache.h"
+#include "../../../util/pmus.h"
+#include "env.h"

struct pmu_alias {
char *name;
@@ -168,3 +170,13 @@ char *pmu_find_alias_name(const char *name)

return __pmu_find_alias_name(name);
}
+
+int perf_pmus__num_mem_pmus(void)
+{
+ /* AMD uses IBS OP pmu and not a core PMU for perf mem/c2c */
+ if (x86__is_amd_cpu())
+ return 1;
+
+ /* Intel uses core pmus for perf mem/c2c */
+ return perf_pmus__num_core_pmus();
+}
diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
index 8c50ab8894b7..a2032c1b7644 100644
--- a/tools/perf/util/pmus.c
+++ b/tools/perf/util/pmus.c
@@ -242,7 +242,7 @@ const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str)
return NULL;
}

-int perf_pmus__num_mem_pmus(void)
+int __weak perf_pmus__num_mem_pmus(void)
{
/* All core PMUs are for mem events. */
return perf_pmus__num_core_pmus();
--
2.40.1


2023-06-15 06:41:33

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] perf mem amd: Fix few logic bugs

On Wed, Jun 14, 2023 at 10:17 PM Ravi Bangoria <[email protected]> wrote:
>
> Recent PMU refactoring changes[1] introduced a notion of core vs other
> PMUs and made perf mem/c2c code depend only on core PMUs, which is
> logically wrong for AMD as perf mem/c2c on AMD depends on IBS OP PMU,
> not the core PMU. Although user visible perf mem/c2c functionality is
> still working fine, internal code logic is wrong. Fix those.
>
> [1] https://lore.kernel.org/r/[email protected]
>
> v1: https://lore.kernel.org/r/[email protected]
> v1->v2:
> - Patch #2 of last version is already picked up by Arnaldo. So skip it.
> - Scan all PMUs unconditionally in perf mem code instead of making it
> conditional on arch.
>
> Ravi Bangoria (3):
> perf pmus: Describe semantics of 'core_pmus' and 'other_pmus'
> perf mem amd: Fix perf_pmus__num_mem_pmus()
> perf mem: Scan all PMUs instead of just core ones

For the series:
Reviewed-by: Ian Rogers <[email protected]>
just a nit in a comment on the 2nd patch.

Thanks,
Ian

>
> tools/perf/arch/x86/util/pmu.c | 12 ++++++++++++
> tools/perf/util/mem-events.c | 13 +++++++++----
> tools/perf/util/pmus.c | 17 ++++++++++++++++-
> 3 files changed, 37 insertions(+), 5 deletions(-)
>
> --
> 2.40.1
>

2023-06-16 14:22:26

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2.1 2/3] perf mem amd: Fix perf_pmus__num_mem_pmus()

Em Thu, Jun 15, 2023 at 11:42:38AM +0530, Ravi Bangoria escreveu:
> perf mem/c2c on AMD internally uses IBS OP PMU, not the core PMU. Also,
> AMD platforms does not have heterogeneous PMUs.

Thanks, applied the series.

- Arnaldo

> Signed-off-by: Ravi Bangoria <[email protected]>
> ---
> tools/perf/arch/x86/util/pmu.c | 12 ++++++++++++
> tools/perf/util/pmus.c | 2 +-
> 2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c
> index 3c0de3370d7e..65d8cdff4d5f 100644
> --- a/tools/perf/arch/x86/util/pmu.c
> +++ b/tools/perf/arch/x86/util/pmu.c
> @@ -14,6 +14,8 @@
> #include "../../../util/intel-bts.h"
> #include "../../../util/pmu.h"
> #include "../../../util/fncache.h"
> +#include "../../../util/pmus.h"
> +#include "env.h"
>
> struct pmu_alias {
> char *name;
> @@ -168,3 +170,13 @@ char *pmu_find_alias_name(const char *name)
>
> return __pmu_find_alias_name(name);
> }
> +
> +int perf_pmus__num_mem_pmus(void)
> +{
> + /* AMD uses IBS OP pmu and not a core PMU for perf mem/c2c */
> + if (x86__is_amd_cpu())
> + return 1;
> +
> + /* Intel uses core pmus for perf mem/c2c */
> + return perf_pmus__num_core_pmus();
> +}
> diff --git a/tools/perf/util/pmus.c b/tools/perf/util/pmus.c
> index 8c50ab8894b7..a2032c1b7644 100644
> --- a/tools/perf/util/pmus.c
> +++ b/tools/perf/util/pmus.c
> @@ -242,7 +242,7 @@ const struct perf_pmu *perf_pmus__pmu_for_pmu_filter(const char *str)
> return NULL;
> }
>
> -int perf_pmus__num_mem_pmus(void)
> +int __weak perf_pmus__num_mem_pmus(void)
> {
> /* All core PMUs are for mem events. */
> return perf_pmus__num_core_pmus();
> --
> 2.40.1
>

--

- Arnaldo