2021-02-08 17:51:45

by Liang, Kan

Subject: [PATCH 00/49] Add Alder Lake support for perf

From: Kan Liang <[email protected]>

(The V1 patchset is a complete patchset for Alder Lake support in
Linux perf. It includes both the kernel patches (1-25) and the user
space patches (26-49), to give the maintainers/reviewers an overall
picture of the ADL enabling patches. Apologies for the large number of
patches. For future versions, the patchset will be divided into a
kernel patch series and a userspace patch series, which can be
reviewed separately.)

Alder Lake uses a hybrid architecture utilizing Golden Cove cores
and Gracemont cores. On such architectures, all CPUs support the same,
homogeneous and symmetric, instruction set, and CPUID enumerates the
same features for all CPUs. There may still be model-specific
differences, such as those addressed in this patchset.

The first two patches enumerate the hybrid CPU feature bit and save the
CPU type in a new field x86_cpu_type in struct cpuinfo_x86 for the
following patches. They were posted previously[1] but not merged.
Compared with the initial submission, they address the following two
concerns[2][3]:
- Provide a good use case, PMU.
- Clarify what Intel Hybrid Technology is and is not.

The PMU capabilities for the Golden Cove cores and the Gracemont cores
are not the same. The key differences include the number of counters,
the events, the perf metrics feature, and the PEBS-via-PT feature. A
dedicated hybrid PMU has to be registered for each of them. However,
the current perf x86 code assumes that there is only one CPU PMU. To
handle the hybrid PMUs, the patchset
- Introduces a new struct x86_hybrid_pmu to save the unique capabilities
of the different PMUs. It's part of the global x86_pmu. The
architectural capabilities, which are available on all PMUs, are still
saved in the global x86_pmu. I once considered dynamically creating a
dedicated x86_pmu and pmu for each hybrid PMU, but then they would have
to become pointers. Since they are used everywhere, the changes would
be huge and complex. Also, most of the PMU capabilities are the same
between hybrid PMUs, so the duplicated data would be stored many times
in the big x86_pmu structure. The dynamic approach was therefore
dropped.
- Moves the hybrid PMU registration to cpu_starting(), because only the
boot CPU is available when init_hw_perf_events() is invoked.
- Adds new structures and helpers for the events attribute and the
format attribute which take the PMU type into account, since hybrid
PMUs have different events and formats.
- Adds the PMU-aware types PERF_TYPE_HARDWARE_PMU and
PERF_TYPE_HW_CACHE_PMU to facilitate user space tools.
The uncore, MSR and cstate PMUs are the same between hybrid CPUs, so
there is no need to register hybrid PMUs for them.

The generic code kernel/events/core.c is not hybrid friendly either,
especially for the per-task monitoring. Peter once proposed a
patchset[4], but it hasn't been merged. This patchset doesn't intend to
improve the generic code (which can be improved later separately). It
still uses the capability PERF_PMU_CAP_HETEROGENEOUS_CPUS for each
hybrid PMU. For per-task and system-wide monitoring, user space tools
have to create events on all available hybrid PMUs. Events from
different hybrid PMUs cannot be included in the same group.
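
For illustration only, below is a minimal user-space sketch of what such
a tool would do with the new PMU-aware type. It is not part of this
series; the numeric type value 6 (PERF_TYPE_HARDWARE_PMU), the 32-bit
PMU-type shift and the PMU type IDs 4/10 are taken from the uapi change
and the perf stat example later in the series, and in practice the type
IDs would be read from /sys/bus/event_source/devices/cpu_core/type and
.../cpu_atom/type. Error handling is omitted.

/* Hypothetical sketch, not part of this patchset. */
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>

static int open_cycles_on_pmu(uint32_t pmu_type)
{
	struct perf_event_attr attr = {
		.type	= 6,	/* PERF_TYPE_HARDWARE_PMU in this series */
		.size	= sizeof(attr),
		/* low bits: hardware event ID; bits 32+: PMU type ID */
		.config	= PERF_COUNT_HW_CPU_CYCLES |
			  ((uint64_t)pmu_type << 32),
		.disabled = 1,
	};

	/* system-wide on CPU 0, just to keep the sketch small */
	return syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
}

int main(void)
{
	uint32_t core_type = 4, atom_type = 10;	/* normally read from sysfs */
	int fd_core = open_cycles_on_pmu(core_type);
	int fd_atom = open_cycles_on_pmu(atom_type);

	printf("core fd %d, atom fd %d\n", fd_core, fd_atom);
	return 0;
}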

[1]. https://lore.kernel.org/lkml/[email protected]/
[2]. https://lore.kernel.org/lkml/[email protected]/
[3]. https://lore.kernel.org/lkml/[email protected]/
[4]. https://lkml.kernel.org/r/[email protected]/

Jin Yao (24):
perf jevents: Support unit value "cpu_core" and "cpu_atom"
perf util: Save pmu name to struct perf_pmu_alias
perf pmu: Save detected hybrid pmus to a global pmu list
perf pmu: Add hybrid helper functions
perf list: Support --cputype option to list hybrid pmu events
perf stat: Hybrid evsel uses its own cpus
perf header: Support HYBRID_TOPOLOGY feature
perf header: Support hybrid CPU_PMU_CAPS
tools headers uapi: Update tools's copy of linux/perf_event.h
perf parse-events: Create two hybrid hardware events
perf parse-events: Create two hybrid cache events
perf parse-events: Support hardware events inside PMU
perf list: Display pmu prefix for partially supported hybrid cache
events
perf parse-events: Support hybrid raw events
perf stat: Support --cputype option for hybrid events
perf stat: Support metrics with hybrid events
perf evlist: Create two hybrid 'cycles' events by default
perf stat: Add default hybrid events
perf stat: Uniquify hybrid event name
perf stat: Merge event counts from all hybrid PMUs
perf stat: Filter out unmatched aggregation for hybrid event
perf evlist: Warn as events from different hybrid PMUs in a group
perf Documentation: Document intel-hybrid support
perf evsel: Adjust hybrid event and global event mixed group

Kan Liang (22):
perf/x86/intel: Hybrid PMU support for perf capabilities
perf/x86: Hybrid PMU support for intel_ctrl
perf/x86: Hybrid PMU support for counters
perf/x86: Hybrid PMU support for unconstrained
perf/x86: Hybrid PMU support for hardware cache event
perf/x86: Hybrid PMU support for event constraints
perf/x86: Hybrid PMU support for extra_regs
perf/x86/intel: Factor out intel_pmu_check_num_counters
perf/x86/intel: Factor out intel_pmu_check_event_constraints
perf/x86/intel: Factor out intel_pmu_check_extra_regs
perf/x86: Expose check_hw_exists
perf/x86: Remove temporary pmu assignment in event_init
perf/x86: Factor out x86_pmu_show_pmu_cap
perf/x86: Register hybrid PMUs
perf/x86: Add structures for the attributes of Hybrid PMUs
perf/x86/intel: Add attr_update for Hybrid PMUs
perf/x86: Support filter_match callback
perf/x86/intel: Add Alder Lake Hybrid support
perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU
perf/x86/intel/uncore: Add Alder Lake support
perf/x86/msr: Add Alder Lake CPU support
perf/x86/cstate: Add Alder Lake CPU support

Ricardo Neri (2):
x86/cpufeatures: Enumerate Intel Hybrid Technology feature bit
x86/cpu: Describe hybrid CPUs in cpuinfo_x86

Zhang Rui (1):
perf/x86/rapl: Add support for Intel Alder Lake

arch/x86/events/core.c | 286 ++++++++++---
arch/x86/events/intel/core.c | 685 ++++++++++++++++++++++++++----
arch/x86/events/intel/cstate.c | 39 +-
arch/x86/events/intel/ds.c | 28 +-
arch/x86/events/intel/uncore.c | 7 +
arch/x86/events/intel/uncore.h | 1 +
arch/x86/events/intel/uncore_snb.c | 131 ++++++
arch/x86/events/msr.c | 2 +
arch/x86/events/perf_event.h | 117 ++++-
arch/x86/events/rapl.c | 2 +
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 2 +
arch/x86/include/asm/processor.h | 13 +
arch/x86/kernel/cpu/common.c | 3 +
include/linux/perf_event.h | 12 +
include/uapi/linux/perf_event.h | 26 ++
kernel/events/core.c | 14 +-
tools/include/uapi/linux/perf_event.h | 26 ++
tools/perf/Documentation/intel-hybrid.txt | 335 +++++++++++++++
tools/perf/Documentation/perf-list.txt | 4 +
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/Documentation/perf-stat.txt | 13 +
tools/perf/builtin-list.c | 42 +-
tools/perf/builtin-record.c | 3 +
tools/perf/builtin-stat.c | 94 +++-
tools/perf/pmu-events/jevents.c | 2 +
tools/perf/util/cputopo.c | 80 ++++
tools/perf/util/cputopo.h | 13 +
tools/perf/util/env.c | 12 +
tools/perf/util/env.h | 18 +-
tools/perf/util/evlist.c | 148 ++++++-
tools/perf/util/evlist.h | 7 +
tools/perf/util/evsel.c | 111 ++++-
tools/perf/util/evsel.h | 10 +-
tools/perf/util/header.c | 267 +++++++++++-
tools/perf/util/header.h | 1 +
tools/perf/util/metricgroup.c | 226 +++++++++-
tools/perf/util/metricgroup.h | 2 +-
tools/perf/util/parse-events.c | 405 +++++++++++++++++-
tools/perf/util/parse-events.h | 10 +-
tools/perf/util/parse-events.y | 21 +-
tools/perf/util/pmu.c | 120 +++++-
tools/perf/util/pmu.h | 24 +-
tools/perf/util/stat-display.c | 28 +-
tools/perf/util/stat.h | 2 +
45 files changed, 3106 insertions(+), 288 deletions(-)
create mode 100644 tools/perf/Documentation/intel-hybrid.txt

--
2.7.4


2021-02-08 17:53:04

by Liang, Kan

Subject: [PATCH 01/49] x86/cpufeatures: Enumerate Intel Hybrid Technology feature bit

From: Ricardo Neri <[email protected]>

Add feature enumeration to identify a processor with Intel Hybrid
Technology: one in which CPUs of more than one type are in the same
package. On a hybrid processor, all CPUs support the same homogeneous
(i.e., symmetric) instruction set. All CPUs enumerate the same features
in CPUID. Thus, software (user space and kernel) can run and migrate to
any CPU in the system as well as utilize any of the enumerated features
without any change or special provisions. The main differences among
CPUs in a hybrid processor are their power and performance properties.
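
For reference, the new bit corresponds to CPUID.(EAX=07H,ECX=0):EDX[15]
(feature word 18, bit 15 below). A hypothetical user-space check, not
part of this patch, could look like:

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* CPUID leaf 7, sub-leaf 0: EDX bit 15 enumerates a hybrid part */
	if (__get_cpuid_count(0x7, 0, &eax, &ebx, &ecx, &edx) &&
	    (edx & (1u << 15)))
		printf("hybrid part: CPUs of more than one type\n");
	else
		printf("not a hybrid part\n");

	return 0;
}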

Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Srinivas Pandruvada <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Tony Luck <[email protected]>
Reviewed-by: Len Brown <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Signed-off-by: Ricardo Neri <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 84b8878..2270df3 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -374,6 +374,7 @@
#define X86_FEATURE_MD_CLEAR (18*32+10) /* VERW clears CPU buffers */
#define X86_FEATURE_TSX_FORCE_ABORT (18*32+13) /* "" TSX_FORCE_ABORT */
#define X86_FEATURE_SERIALIZE (18*32+14) /* SERIALIZE instruction */
+#define X86_FEATURE_HYBRID_CPU (18*32+15) /* This part has CPUs of more than one type */
#define X86_FEATURE_TSXLDTRK (18*32+16) /* TSX Suspend Load Address Tracking */
#define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
#define X86_FEATURE_ARCH_LBR (18*32+19) /* Intel ARCH LBR */
--
2.7.4

2021-02-08 17:56:45

by Liang, Kan

Subject: [PATCH 38/49] perf list: Display pmu prefix for partially supported hybrid cache events

From: Jin Yao <[email protected]>

Some hardware cache events are only available on one cpu pmu.
For example, 'L1-dcache-load-misses' is only available on cpu_core.
perf list should clearly report this info.

root@otcpl-adl-s-2:~# ./perf list

Before:
L1-dcache-load-misses [Hardware cache event]
L1-dcache-loads [Hardware cache event]
L1-dcache-stores [Hardware cache event]
L1-icache-load-misses [Hardware cache event]
L1-icache-loads [Hardware cache event]
LLC-load-misses [Hardware cache event]
LLC-loads [Hardware cache event]
LLC-store-misses [Hardware cache event]
LLC-stores [Hardware cache event]
branch-load-misses [Hardware cache event]
branch-loads [Hardware cache event]
dTLB-load-misses [Hardware cache event]
dTLB-loads [Hardware cache event]
dTLB-store-misses [Hardware cache event]
dTLB-stores [Hardware cache event]
iTLB-load-misses [Hardware cache event]
node-load-misses [Hardware cache event]
node-loads [Hardware cache event]
node-store-misses [Hardware cache event]
node-stores [Hardware cache event]

After:
L1-dcache-loads [Hardware cache event]
L1-dcache-stores [Hardware cache event]
L1-icache-load-misses [Hardware cache event]
LLC-load-misses [Hardware cache event]
LLC-loads [Hardware cache event]
LLC-store-misses [Hardware cache event]
LLC-stores [Hardware cache event]
branch-load-misses [Hardware cache event]
branch-loads [Hardware cache event]
cpu_atom/L1-icache-loads/ [Hardware cache event]
cpu_core/L1-dcache-load-misses/ [Hardware cache event]
cpu_core/node-load-misses/ [Hardware cache event]
cpu_core/node-loads/ [Hardware cache event]
dTLB-load-misses [Hardware cache event]
dTLB-loads [Hardware cache event]
dTLB-store-misses [Hardware cache event]
dTLB-stores [Hardware cache event]
iTLB-load-misses [Hardware cache event]

Now we can clearly see that 'L1-dcache-load-misses' is only available
on cpu_core.

Events listed without a pmu prefix are available on both cpu_core and
cpu_atom.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/parse-events.c | 79 +++++++++++++++++++++++++++++++++++++-----
tools/perf/util/pmu.c | 11 ++++++
tools/perf/util/pmu.h | 2 ++
3 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index bba7db3..ddf6f79 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2809,7 +2809,7 @@ int is_valid_tracepoint(const char *event_string)
return 0;
}

-static bool is_event_supported(u8 type, unsigned config)
+static bool is_event_supported(u8 type, u64 config)
{
bool ret = true;
int open_return;
@@ -2929,10 +2929,21 @@ void print_sdt_events(const char *subsys_glob, const char *event_glob,

int print_hwcache_events(const char *event_glob, bool name_only)
{
- unsigned int type, op, i, evt_i = 0, evt_num = 0;
- char name[64];
- char **evt_list = NULL;
+ unsigned int type, op, i, evt_i = 0, evt_num = 0, npmus;
+ char name[64], new_name[128];
+ char **evt_list = NULL, **evt_pmus = NULL;
bool evt_num_known = false;
+ struct perf_pmu *pmu = NULL;
+
+ if (!perf_pmu__hybrid_exist())
+ perf_pmu__scan(NULL);
+
+ npmus = perf_pmu__hybrid_npmus();
+ if (npmus) {
+ evt_pmus = zalloc(sizeof(char *) * npmus);
+ if (!evt_pmus)
+ goto out_enomem;
+ }

restart:
if (evt_num_known) {
@@ -2948,20 +2959,61 @@ int print_hwcache_events(const char *event_glob, bool name_only)
continue;

for (i = 0; i < PERF_COUNT_HW_CACHE_RESULT_MAX; i++) {
+ unsigned int hybrid_supported = 0, j;
+ bool supported;
+
__evsel__hw_cache_type_op_res_name(type, op, i, name, sizeof(name));
if (event_glob != NULL && !strglobmatch(name, event_glob))
continue;

- if (!is_event_supported(PERF_TYPE_HW_CACHE,
- type | (op << 8) | (i << 16)))
- continue;
+ if (!perf_pmu__hybrid_exist()) {
+ if (!is_event_supported(PERF_TYPE_HW_CACHE,
+ type | (op << 8) | (i << 16))) {
+ continue;
+ }
+ } else {
+ perf_pmu__for_each_hybrid_pmus(pmu) {
+ if (!evt_num_known) {
+ evt_num++;
+ continue;
+ }
+
+ supported = is_event_supported(
+ PERF_TYPE_HW_CACHE_PMU,
+ type | (op << 8) | (i << 16) |
+ ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT));
+ if (supported) {
+ snprintf(new_name, sizeof(new_name), "%s/%s/",
+ pmu->name, name);
+ evt_pmus[hybrid_supported] = strdup(new_name);
+ hybrid_supported++;
+ }
+ }
+
+ if (hybrid_supported == 0)
+ continue;
+ }

if (!evt_num_known) {
evt_num++;
continue;
}

- evt_list[evt_i] = strdup(name);
+ if ((hybrid_supported == 0) ||
+ (hybrid_supported == npmus)) {
+ evt_list[evt_i] = strdup(name);
+ if (npmus > 0) {
+ for (j = 0; j < npmus; j++)
+ zfree(&evt_pmus[j]);
+ }
+ } else {
+ for (j = 0; j < hybrid_supported; j++) {
+ evt_list[evt_i++] = evt_pmus[j];
+ evt_pmus[j] = NULL;
+ }
+ continue;
+ }
+
if (evt_list[evt_i] == NULL)
goto out_enomem;
evt_i++;
@@ -2973,6 +3025,13 @@ int print_hwcache_events(const char *event_glob, bool name_only)
evt_num_known = true;
goto restart;
}
+
+ for (evt_i = 0; evt_i < evt_num; evt_i++) {
+ if (!evt_list[evt_i])
+ break;
+ }
+
+ evt_num = evt_i;
qsort(evt_list, evt_num, sizeof(char *), cmp_string);
evt_i = 0;
while (evt_i < evt_num) {
@@ -2991,6 +3050,10 @@ int print_hwcache_events(const char *event_glob, bool name_only)
for (evt_i = 0; evt_i < evt_num; evt_i++)
zfree(&evt_list[evt_i]);
zfree(&evt_list);
+
+ for (evt_i = 0; evt_i < npmus; evt_i++)
+ zfree(&evt_pmus[evt_i]);
+ zfree(&evt_pmus);
return evt_num;

out_enomem:
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index ca2fc67..5ebb0da 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1901,3 +1901,14 @@ char *perf_pmu__hybrid_type_to_pmu(const char *type)
free(pmu_name);
return NULL;;
}
+
+int perf_pmu__hybrid_npmus(void)
+{
+ struct perf_pmu *pmu;
+ int n = 0;
+
+ perf_pmu__for_each_hybrid_pmus(pmu)
+ n++;
+
+ return n;
+}
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index ccffc05..4bd7473 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -143,4 +143,6 @@ static inline bool perf_pmu__hybrid_exist(void)
return !list_empty(&perf_pmu__hybrid_pmus);
}

+int perf_pmu__hybrid_npmus(void);
+
#endif /* __PMU_H */
--
2.7.4

2021-02-08 17:57:50

by Liang, Kan

Subject: [PATCH 04/49] perf/x86: Hybrid PMU support for intel_ctrl

From: Kan Liang <[email protected]>

The intel_ctrl is the event mask of a PMU. The PMU counter information
may be different among hybrid PMUs, so each hybrid PMU should use its
own intel_ctrl.

When handling a certain hybrid PMU, apply the intel_ctrl from the
corresponding hybrid PMU.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 4 ++--
arch/x86/events/intel/core.c | 14 +++++++++-----
arch/x86/events/perf_event.h | 7 +++++--
3 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 334553f..170acbf 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -256,7 +256,7 @@ static bool check_hw_exists(void)
if (ret)
goto msr_fail;
for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
- if (fixed_counter_disabled(i))
+ if (fixed_counter_disabled(i, NULL))
continue;
if (val & (0x03 << i*4)) {
bios_fail = 1;
@@ -1535,7 +1535,7 @@ void perf_event_print_debug(void)
cpu, idx, prev_left);
}
for (idx = 0; idx < x86_pmu.num_counters_fixed; idx++) {
- if (fixed_counter_disabled(idx))
+ if (fixed_counter_disabled(idx, cpuc))
continue;
rdmsrl(MSR_ARCH_PERFMON_FIXED_CTR0 + idx, pmc_count);

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 4d026f6..1b9563c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2153,10 +2153,11 @@ static void intel_pmu_disable_all(void)
static void __intel_pmu_enable_all(int added, bool pmi)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ u64 intel_ctrl = X86_HYBRID_READ_FROM_CPUC(intel_ctrl, cpuc);

intel_pmu_lbr_enable_all(pmi);
wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL,
- x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask);
+ intel_ctrl & ~cpuc->intel_ctrl_guest_mask);

if (test_bit(INTEL_PMC_IDX_FIXED_BTS, cpuc->active_mask)) {
struct perf_event *event =
@@ -2709,6 +2710,7 @@ int intel_pmu_save_and_restart(struct perf_event *event)
static void intel_pmu_reset(void)
{
struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
unsigned long flags;
int idx;

@@ -2724,7 +2726,7 @@ static void intel_pmu_reset(void)
wrmsrl_safe(x86_pmu_event_addr(idx), 0ull);
}
for (idx = 0; idx < x86_pmu.num_counters_fixed; idx++) {
- if (fixed_counter_disabled(idx))
+ if (fixed_counter_disabled(idx, cpuc))
continue;
wrmsrl_safe(MSR_ARCH_PERFMON_FIXED_CTR0 + idx, 0ull);
}
@@ -2753,6 +2755,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
int bit;
int handled = 0;
+ u64 intel_ctrl = X86_HYBRID_READ_FROM_CPUC(intel_ctrl, cpuc);

inc_irq_stat(apic_perf_irqs);

@@ -2798,7 +2801,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)

handled++;
x86_pmu.drain_pebs(regs, &data);
- status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
+ status &= intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;

/*
* PMI throttle may be triggered, which stops the PEBS event.
@@ -3808,10 +3811,11 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
+ u64 intel_ctrl = X86_HYBRID_READ_FROM_CPUC(intel_ctrl, cpuc);

arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
- arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
- arr[0].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask;
+ arr[0].host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
+ arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
if (x86_pmu.flags & PMU_FL_PEBS_ALL)
arr[0].guest &= ~cpuc->pebs_enabled;
else
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index a53d4dd..b939784 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -647,6 +647,7 @@ enum x86_hybrid_pmu_type_idx {
struct x86_hybrid_pmu {
struct pmu pmu;
union perf_capabilities intel_cap;
+ u64 intel_ctrl;
};

#define IS_X86_HYBRID cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)
@@ -1106,9 +1107,11 @@ ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
ssize_t events_ht_sysfs_show(struct device *dev, struct device_attribute *attr,
char *page);

-static inline bool fixed_counter_disabled(int i)
+static inline bool fixed_counter_disabled(int i, struct cpu_hw_events *cpuc)
{
- return !(x86_pmu.intel_ctrl >> (i + INTEL_PMC_IDX_FIXED));
+ u64 intel_ctrl = X86_HYBRID_READ_FROM_CPUC(intel_ctrl, cpuc);
+
+ return !(intel_ctrl >> (i + INTEL_PMC_IDX_FIXED));
}

#ifdef CONFIG_CPU_SUP_AMD
--
2.7.4

2021-02-08 17:58:52

by Liang, Kan

Subject: [PATCH 28/49] perf pmu: Save detected hybrid pmus to a global pmu list

From: Jin Yao <[email protected]>

We identify the cpu_core pmu and cpu_atom pmu by explicitly
checking the following files:

For cpu_core, check:
"/sys/bus/event_source/devices/cpu_core/cpus"

For cpu_atom, check:
"/sys/bus/event_source/devices/cpu_atom/cpus"

If the 'cpus' file exists, the pmu exists.

But in order not to hardcode "cpu_core" and "cpu_atom", and to keep
the code generic, a pmu is treated as a hybrid pmu if the path
"/sys/bus/event_source/devices/cpu_xxx/cpus" exists. All detected
hybrid pmus are linked into a global list 'perf_pmu__hybrid_pmus',
which can then simply be iterated with
perf_pmu__for_each_hybrid_pmus, as in the sketch below.
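
A minimal usage sketch of the new iteration helper (hypothetical
caller, not part of this patch):

#include "pmu.h"	/* struct perf_pmu, perf_pmu__for_each_hybrid_pmus */
#include "debug.h"	/* pr_debug */

static void dump_hybrid_pmus(void)
{
	struct perf_pmu *pmu;

	/* the pmu list must have been populated, e.g. by perf_pmu__scan(NULL) */
	perf_pmu__for_each_hybrid_pmus(pmu)
		pr_debug("hybrid pmu: %s (type %u)\n", pmu->name, pmu->type);
}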

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/pmu.c | 21 +++++++++++++++++++++
tools/perf/util/pmu.h | 7 +++++++
2 files changed, 28 insertions(+)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 0c25457..e97b121 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -27,6 +27,7 @@
#include "fncache.h"

struct perf_pmu perf_pmu__fake;
+LIST_HEAD(perf_pmu__hybrid_pmus);

struct perf_pmu_format {
char *name;
@@ -633,11 +634,27 @@ static struct perf_cpu_map *pmu_cpumask(const char *name)
return NULL;
}

+static bool pmu_is_hybrid(const char *name)
+{
+ char path[PATH_MAX];
+ const char *sysfs;
+
+ if (strncmp(name, "cpu_", 4))
+ return false;
+
+ sysfs = sysfs__mountpoint();
+ snprintf(path, PATH_MAX, CPUS_TEMPLATE_CPU, sysfs, name);
+ return file_available(path);
+}
+
static bool pmu_is_uncore(const char *name)
{
char path[PATH_MAX];
const char *sysfs;

+ if (pmu_is_hybrid(name))
+ return false;
+
sysfs = sysfs__mountpoint();
snprintf(path, PATH_MAX, CPUS_TEMPLATE_UNCORE, sysfs, name);
return file_available(path);
@@ -951,6 +968,7 @@ static struct perf_pmu *pmu_lookup(const char *name)
pmu->is_uncore = pmu_is_uncore(name);
if (pmu->is_uncore)
pmu->id = pmu_id(name);
+ pmu->is_hybrid = pmu_is_hybrid(name);
pmu->max_precise = pmu_max_precise(name);
pmu_add_cpu_aliases(&aliases, pmu);
pmu_add_sys_aliases(&aliases, pmu);
@@ -962,6 +980,9 @@ static struct perf_pmu *pmu_lookup(const char *name)
list_splice(&aliases, &pmu->aliases);
list_add_tail(&pmu->list, &pmus);

+ if (pmu->is_hybrid)
+ list_add_tail(&pmu->hybrid_list, &perf_pmu__hybrid_pmus);
+
pmu->default_config = perf_pmu__get_default_config(pmu);

return pmu;
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 0e724d5..99bdb5d 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -5,6 +5,7 @@
#include <linux/bitmap.h>
#include <linux/compiler.h>
#include <linux/perf_event.h>
+#include <linux/list.h>
#include <stdbool.h>
#include "parse-events.h"
#include "pmu-events/pmu-events.h"
@@ -34,6 +35,7 @@ struct perf_pmu {
__u32 type;
bool selectable;
bool is_uncore;
+ bool is_hybrid;
bool auxtrace;
int max_precise;
struct perf_event_attr *default_config;
@@ -42,9 +44,11 @@ struct perf_pmu {
struct list_head aliases; /* HEAD struct perf_pmu_alias -> list */
struct list_head caps; /* HEAD struct perf_pmu_caps -> list */
struct list_head list; /* ELEM */
+ struct list_head hybrid_list;
};

extern struct perf_pmu perf_pmu__fake;
+extern struct list_head perf_pmu__hybrid_pmus;

struct perf_pmu_info {
const char *unit;
@@ -124,4 +128,7 @@ int perf_pmu__convert_scale(const char *scale, char **end, double *sval);

int perf_pmu__caps_parse(struct perf_pmu *pmu);

+#define perf_pmu__for_each_hybrid_pmus(pmu) \
+ list_for_each_entry(pmu, &perf_pmu__hybrid_pmus, hybrid_list)
+
#endif /* __PMU_H */
--
2.7.4

2021-02-08 17:59:06

by Liang, Kan

Subject: [PATCH 03/49] perf/x86/intel: Hybrid PMU support for perf capabilities

From: Kan Liang <[email protected]>

Some platforms, e.g. Alder Lake, have a hybrid architecture. Although
most PMU capabilities are the same, there are still some unique PMU
capabilities on the different hybrid PMUs. Perf should register a
dedicated pmu for each hybrid PMU.

Add a new struct x86_hybrid_pmu, which saves the dedicated pmu and
capabilities for each hybrid PMU.

The 'hybrid_pmu_idx' is introduced in the per-CPU struct cpu_hw_events
to indicate the index of the hybrid PMU for this CPU.

The architecture MSR, MSR_IA32_PERF_CAPABILITIES, only indicates the
architecture features which are available on all hybrid PMUs. The
architecture features are stored in the global x86_pmu.intel_cap.

For Alder Lake, the model-specific features are perf metrics and
PEBS-via-PT. The corresponding bits of the global x86_pmu.intel_cap
should be 0 for these two features. Perf should not use the global
intel_cap to check the features on a hybrid system.
Add a dedicated intel_cap in the x86_hybrid_pmu to store the
model-specific capabilities. Use the dedicated intel_cap instead of
the global intel_cap for these two features. The dedicated intel_cap
will be set in the following "Add Alder Lake Hybrid support" patch.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 7 +++++--
arch/x86/events/intel/core.c | 31 ++++++++++++++++++++++++++-----
arch/x86/events/intel/ds.c | 2 +-
arch/x86/events/perf_event.h | 38 ++++++++++++++++++++++++++++++++++++++
arch/x86/include/asm/msr-index.h | 2 ++
5 files changed, 72 insertions(+), 8 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 6ddeed3..334553f 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -48,6 +48,7 @@ struct x86_pmu x86_pmu __read_mostly;

DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events) = {
.enabled = 1,
+ .hybrid_pmu_idx = X86_NON_HYBRID_PMU,
};

DEFINE_STATIC_KEY_FALSE(rdpmc_never_available_key);
@@ -1092,8 +1093,9 @@ static void del_nr_metric_event(struct cpu_hw_events *cpuc,
static int collect_event(struct cpu_hw_events *cpuc, struct perf_event *event,
int max_count, int n)
{
+ union perf_capabilities intel_cap = X86_HYBRID_READ_FROM_CPUC(intel_cap, cpuc);

- if (x86_pmu.intel_cap.perf_metrics && add_nr_metric_event(cpuc, event))
+ if (intel_cap.perf_metrics && add_nr_metric_event(cpuc, event))
return -EINVAL;

if (n >= max_count + cpuc->n_metric)
@@ -1569,6 +1571,7 @@ void x86_pmu_stop(struct perf_event *event, int flags)
static void x86_pmu_del(struct perf_event *event, int flags)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ union perf_capabilities intel_cap = X86_HYBRID_READ_FROM_CPUC(intel_cap, cpuc);
int i;

/*
@@ -1608,7 +1611,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
}
cpuc->event_constraint[i-1] = NULL;
--cpuc->n_events;
- if (x86_pmu.intel_cap.perf_metrics)
+ if (intel_cap.perf_metrics)
del_nr_metric_event(cpuc, event);

perf_event_update_userpage(event);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 67a7246..4d026f6 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3646,6 +3646,19 @@ static inline bool is_mem_loads_aux_event(struct perf_event *event)
return (event->attr.config & INTEL_ARCH_EVENT_MASK) == X86_CONFIG(.event=0x03, .umask=0x82);
}

+static inline bool intel_pmu_has_cap(struct perf_event *event, int idx)
+{
+ struct x86_hybrid_pmu *pmu;
+
+ if (!IS_X86_HYBRID)
+ return test_bit(idx, (unsigned long *)&x86_pmu.intel_cap.capabilities);
+
+ pmu = container_of(event->pmu, struct x86_hybrid_pmu, pmu);
+ if (test_bit(idx, (unsigned long *)&pmu->intel_cap.capabilities))
+ return true;
+
+ return false;
+}

static int intel_pmu_hw_config(struct perf_event *event)
{
@@ -3709,7 +3722,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
* with a slots event as group leader. When the slots event
* is used in a metrics group, it too cannot support sampling.
*/
- if (x86_pmu.intel_cap.perf_metrics && is_topdown_event(event)) {
+ if (intel_pmu_has_cap(event, PERF_CAP_METRICS_IDX) && is_topdown_event(event)) {
if (event->attr.config1 || event->attr.config2)
return -EINVAL;

@@ -4216,8 +4229,16 @@ static void intel_pmu_cpu_starting(int cpu)
if (x86_pmu.version > 1)
flip_smm_bit(&x86_pmu.attr_freeze_on_smi);

- /* Disable perf metrics if any added CPU doesn't support it. */
- if (x86_pmu.intel_cap.perf_metrics) {
+ /*
+ * Disable perf metrics if any added CPU doesn't support it.
+ *
+ * Turn off the check for a hybrid architecture, because the
+ * architecture MSR, MSR_IA32_PERF_CAPABILITIES, only indicate
+ * the architecture features. The perf metrics is a model-specific
+ * feature for now. The corresponding bit should always be 0 on
+ * a hybrid platform, e.g., Alder Lake.
+ */
+ if (!IS_X86_HYBRID && x86_pmu.intel_cap.perf_metrics) {
union perf_capabilities perf_cap;

rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap.capabilities);
@@ -4327,7 +4348,7 @@ static int intel_pmu_check_period(struct perf_event *event, u64 value)

static int intel_pmu_aux_output_match(struct perf_event *event)
{
- if (!x86_pmu.intel_cap.pebs_output_pt_available)
+ if (!intel_pmu_has_cap(event, PERF_CAP_PT_IDX))
return 0;

return is_intel_pt_event(event);
@@ -5764,7 +5785,7 @@ __init int intel_pmu_init(void)
pr_cont("full-width counters, ");
}

- if (x86_pmu.intel_cap.perf_metrics)
+ if (!IS_X86_HYBRID && x86_pmu.intel_cap.perf_metrics)
x86_pmu.intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;

return 0;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7ebae18..ba7cf05 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2205,7 +2205,7 @@ void __init intel_ds_init(void)
}
pr_cont("PEBS fmt4%c%s, ", pebs_type, pebs_qual);

- if (x86_pmu.intel_cap.pebs_output_pt_available) {
+ if (!IS_X86_HYBRID && x86_pmu.intel_cap.pebs_output_pt_available) {
pr_cont("PEBS-via-PT, ");
x86_get_pmu()->capabilities |= PERF_PMU_CAP_AUX_OUTPUT;
}
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 53b2b5f..a53d4dd 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -327,6 +327,11 @@ struct cpu_hw_events {
int n_pair; /* Large increment events */

void *kfree_on_online[X86_PERF_KFREE_MAX];
+
+ /*
+ * Hybrid PMU support
+ */
+ int hybrid_pmu_idx;
};

#define __EVENT_CONSTRAINT_RANGE(c, e, n, m, w, o, f) { \
@@ -630,6 +635,30 @@ enum {
x86_lbr_exclusive_max,
};

+enum x86_hybrid_pmu_type_idx {
+ X86_NON_HYBRID_PMU = -1,
+ X86_HYBRID_PMU_ATOM_IDX = 0,
+ X86_HYBRID_PMU_CORE_IDX,
+
+ X86_HYBRID_PMU_MAX_INDEX
+};
+
+
+struct x86_hybrid_pmu {
+ struct pmu pmu;
+ union perf_capabilities intel_cap;
+};
+
+#define IS_X86_HYBRID cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)
+
+#define HAS_VALID_HYBRID_PMU_IN_CPUC(_cpuc) \
+ (IS_X86_HYBRID && \
+ ((_cpuc)->hybrid_pmu_idx >= X86_HYBRID_PMU_ATOM_IDX) && \
+ ((_cpuc)->hybrid_pmu_idx < X86_HYBRID_PMU_MAX_INDEX))
+
+#define X86_HYBRID_READ_FROM_CPUC(_name, _cpuc) \
+ (_cpuc && HAS_VALID_HYBRID_PMU_IN_CPUC(_cpuc) ? x86_pmu.hybrid_pmu[(_cpuc)->hybrid_pmu_idx]._name : x86_pmu._name)
+
/*
* struct x86_pmu - generic x86 pmu
*/
@@ -816,6 +845,15 @@ struct x86_pmu {
int (*check_period) (struct perf_event *event, u64 period);

int (*aux_output_match) (struct perf_event *event);
+
+ /*
+ * Hybrid support
+ *
+ * Most PMU capabilities are the same among different hybrid PMUs. The
+ * global x86_pmu saves the architecture capabilities, which are available
+ * for all PMUs. The hybrid_pmu only includes the unique capabilities.
+ */
+ struct x86_hybrid_pmu hybrid_pmu[X86_HYBRID_PMU_MAX_INDEX];
};

struct x86_perf_task_context_opt {
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 546d6ec..c6d7247 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -185,6 +185,8 @@
#define MSR_PEBS_DATA_CFG 0x000003f2
#define MSR_IA32_DS_AREA 0x00000600
#define MSR_IA32_PERF_CAPABILITIES 0x00000345
+#define PERF_CAP_METRICS_IDX 15
+#define PERF_CAP_PT_IDX 16
#define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6

#define MSR_IA32_RTIT_CTL 0x00000570
--
2.7.4

2021-02-08 17:59:50

by Liang, Kan

Subject: [PATCH 06/49] perf/x86: Hybrid PMU support for unconstrained

From: Kan Liang <[email protected]>

The unconstrained value depends on the number of GP and fixed counters.
Each hybrid PMU should use its own unconstrained.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 5 ++++-
arch/x86/events/perf_event.h | 1 +
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 3b8d728..9baa6b6 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3147,7 +3147,10 @@ x86_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
}
}

- return &unconstrained;
+ if (!HAS_VALID_HYBRID_PMU_IN_CPUC(cpuc))
+ return &unconstrained;
+
+ return &x86_pmu.hybrid_pmu[cpuc->hybrid_pmu_idx].unconstrained;
}

static struct event_constraint *
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index bda4bdc..f11dbc4 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -651,6 +651,7 @@ struct x86_hybrid_pmu {
int max_pebs_events;
int num_counters;
int num_counters_fixed;
+ struct event_constraint unconstrained;
};

#define IS_X86_HYBRID cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)
--
2.7.4

2021-02-08 18:02:01

by Liang, Kan

Subject: [PATCH 05/49] perf/x86: Hybrid PMU support for counters

From: Kan Liang <[email protected]>

The numbers of GP and fixed counters differ among hybrid PMUs.
Each hybrid PMU should use its own counter-related information.

When handling a certain hybrid PMU, apply the number of counters from
the corresponding hybrid PMU.

When reserving the counters in the initialization of a new event,
reserve all possible counters. Add a hybrid_pmu_bitmap to indicate the
possible hybrid PMUs. For the non-hybrid architecture, it's empty.

The number of counters recorded in the global x86_pmu is the number of
architectural counters, which are available on all hybrid PMUs. KVM
doesn't support the hybrid PMU yet, so return the number of
architectural counters for now.

For functions that are only used on the old platforms, e.g.,
intel_pmu_drain_pebs_nhm(), nothing is changed.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 55 ++++++++++++++++++++++++++++++--------------
arch/x86/events/intel/core.c | 8 ++++---
arch/x86/events/intel/ds.c | 14 +++++++----
arch/x86/events/perf_event.h | 5 ++++
4 files changed, 57 insertions(+), 25 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 170acbf..5f79b37 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -179,16 +179,29 @@ static DEFINE_MUTEX(pmc_reserve_mutex);

#ifdef CONFIG_X86_LOCAL_APIC

+static inline int get_possible_num_counters(void)
+{
+ int bit, num_counters = 0;
+
+ if (!IS_X86_HYBRID)
+ return x86_pmu.num_counters;
+
+ for_each_set_bit(bit, &x86_pmu.hybrid_pmu_bitmap, X86_HYBRID_PMU_MAX_INDEX)
+ num_counters = max_t(int, num_counters, x86_pmu.hybrid_pmu[bit].num_counters);
+
+ return num_counters;
+}
+
static bool reserve_pmc_hardware(void)
{
- int i;
+ int i, num_counters = get_possible_num_counters();

- for (i = 0; i < x86_pmu.num_counters; i++) {
+ for (i = 0; i < num_counters; i++) {
if (!reserve_perfctr_nmi(x86_pmu_event_addr(i)))
goto perfctr_fail;
}

- for (i = 0; i < x86_pmu.num_counters; i++) {
+ for (i = 0; i < num_counters; i++) {
if (!reserve_evntsel_nmi(x86_pmu_config_addr(i)))
goto eventsel_fail;
}
@@ -199,7 +212,7 @@ static bool reserve_pmc_hardware(void)
for (i--; i >= 0; i--)
release_evntsel_nmi(x86_pmu_config_addr(i));

- i = x86_pmu.num_counters;
+ i = num_counters;

perfctr_fail:
for (i--; i >= 0; i--)
@@ -210,9 +223,9 @@ static bool reserve_pmc_hardware(void)

static void release_pmc_hardware(void)
{
- int i;
+ int i, num_counters = get_possible_num_counters();

- for (i = 0; i < x86_pmu.num_counters; i++) {
+ for (i = 0; i < num_counters; i++) {
release_perfctr_nmi(x86_pmu_event_addr(i));
release_evntsel_nmi(x86_pmu_config_addr(i));
}
@@ -933,6 +946,7 @@ EXPORT_SYMBOL_GPL(perf_assign_events);

int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
{
+ int num_counters = X86_HYBRID_READ_FROM_CPUC(num_counters, cpuc);
struct event_constraint *c;
struct perf_event *e;
int n0, i, wmin, wmax, unsched = 0;
@@ -1008,7 +1022,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)

/* slow path */
if (i != n) {
- int gpmax = x86_pmu.num_counters;
+ int gpmax = num_counters;

/*
* Do not allow scheduling of more than half the available
@@ -1029,7 +1043,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
* the extra Merge events needed by large increment events.
*/
if (x86_pmu.flags & PMU_FL_PAIR) {
- gpmax = x86_pmu.num_counters - cpuc->n_pair;
+ gpmax = num_counters - cpuc->n_pair;
WARN_ON(gpmax <= 0);
}

@@ -1116,10 +1130,12 @@ static int collect_event(struct cpu_hw_events *cpuc, struct perf_event *event,
*/
static int collect_events(struct cpu_hw_events *cpuc, struct perf_event *leader, bool dogrp)
{
+ int num_counters = X86_HYBRID_READ_FROM_CPUC(num_counters, cpuc);
+ int num_counters_fixed = X86_HYBRID_READ_FROM_CPUC(num_counters_fixed, cpuc);
struct perf_event *event;
int n, max_count;

- max_count = x86_pmu.num_counters + x86_pmu.num_counters_fixed;
+ max_count = num_counters + num_counters_fixed;

/* current number of events already accepted */
n = cpuc->n_events;
@@ -1487,18 +1503,18 @@ void perf_event_print_debug(void)
{
u64 ctrl, status, overflow, pmc_ctrl, pmc_count, prev_left, fixed;
u64 pebs, debugctl;
- struct cpu_hw_events *cpuc;
+ int cpu = smp_processor_id();
+ struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
+ int num_counters = X86_HYBRID_READ_FROM_CPUC(num_counters, cpuc);
+ int num_counters_fixed = X86_HYBRID_READ_FROM_CPUC(num_counters_fixed, cpuc);
unsigned long flags;
- int cpu, idx;
+ int idx;

- if (!x86_pmu.num_counters)
+ if (!num_counters)
return;

local_irq_save(flags);

- cpu = smp_processor_id();
- cpuc = &per_cpu(cpu_hw_events, cpu);
-
if (x86_pmu.version >= 2) {
rdmsrl(MSR_CORE_PERF_GLOBAL_CTRL, ctrl);
rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, status);
@@ -1521,7 +1537,7 @@ void perf_event_print_debug(void)
}
pr_info("CPU#%d: active: %016llx\n", cpu, *(u64 *)cpuc->active_mask);

- for (idx = 0; idx < x86_pmu.num_counters; idx++) {
+ for (idx = 0; idx < num_counters; idx++) {
rdmsrl(x86_pmu_config_addr(idx), pmc_ctrl);
rdmsrl(x86_pmu_event_addr(idx), pmc_count);

@@ -1534,7 +1550,7 @@ void perf_event_print_debug(void)
pr_info("CPU#%d: gen-PMC%d left: %016llx\n",
cpu, idx, prev_left);
}
- for (idx = 0; idx < x86_pmu.num_counters_fixed; idx++) {
+ for (idx = 0; idx < num_counters_fixed; idx++) {
if (fixed_counter_disabled(idx, cpuc))
continue;
rdmsrl(MSR_ARCH_PERFMON_FIXED_CTR0 + idx, pmc_count);
@@ -2776,6 +2792,11 @@ unsigned long perf_misc_flags(struct pt_regs *regs)
void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
{
cap->version = x86_pmu.version;
+ /*
+ * KVM doesn't support the hybrid PMU yet.
+ * Return the common value in global x86_pmu,
+ * which available for all cores.
+ */
cap->num_counters_gp = x86_pmu.num_counters;
cap->num_counters_fixed = x86_pmu.num_counters_fixed;
cap->bit_width_gp = x86_pmu.cntval_bits;
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 1b9563c..3b8d728 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2711,21 +2711,23 @@ static void intel_pmu_reset(void)
{
struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ int num_counters = X86_HYBRID_READ_FROM_CPUC(num_counters, cpuc);
+ int num_counters_fixed = X86_HYBRID_READ_FROM_CPUC(num_counters_fixed, cpuc);
unsigned long flags;
int idx;

- if (!x86_pmu.num_counters)
+ if (!num_counters)
return;

local_irq_save(flags);

pr_info("clearing PMU state on CPU#%d\n", smp_processor_id());

- for (idx = 0; idx < x86_pmu.num_counters; idx++) {
+ for (idx = 0; idx < num_counters; idx++) {
wrmsrl_safe(x86_pmu_config_addr(idx), 0ull);
wrmsrl_safe(x86_pmu_event_addr(idx), 0ull);
}
- for (idx = 0; idx < x86_pmu.num_counters_fixed; idx++) {
+ for (idx = 0; idx < num_counters_fixed; idx++) {
if (fixed_counter_disabled(idx, cpuc))
continue;
wrmsrl_safe(MSR_ARCH_PERFMON_FIXED_CTR0 + idx, 0ull);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index ba7cf05..a528966 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1007,6 +1007,8 @@ void intel_pmu_pebs_sched_task(struct perf_event_context *ctx, bool sched_in)
static inline void pebs_update_threshold(struct cpu_hw_events *cpuc)
{
struct debug_store *ds = cpuc->ds;
+ int max_pebs_events = X86_HYBRID_READ_FROM_CPUC(max_pebs_events, cpuc);
+ int num_counters_fixed = X86_HYBRID_READ_FROM_CPUC(num_counters_fixed, cpuc);
u64 threshold;
int reserved;

@@ -1014,9 +1016,9 @@ static inline void pebs_update_threshold(struct cpu_hw_events *cpuc)
return;

if (x86_pmu.flags & PMU_FL_PEBS_ALL)
- reserved = x86_pmu.max_pebs_events + x86_pmu.num_counters_fixed;
+ reserved = max_pebs_events + num_counters_fixed;
else
- reserved = x86_pmu.max_pebs_events;
+ reserved = max_pebs_events;

if (cpuc->n_pebs == cpuc->n_large_pebs) {
threshold = ds->pebs_absolute_maximum -
@@ -2072,6 +2074,8 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
{
short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ int max_pebs_events = X86_HYBRID_READ_FROM_CPUC(max_pebs_events, cpuc);
+ int num_counters_fixed = X86_HYBRID_READ_FROM_CPUC(num_counters_fixed, cpuc);
struct debug_store *ds = cpuc->ds;
struct perf_event *event;
void *base, *at, *top;
@@ -2086,9 +2090,9 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d

ds->pebs_index = ds->pebs_buffer_base;

- mask = ((1ULL << x86_pmu.max_pebs_events) - 1) |
- (((1ULL << x86_pmu.num_counters_fixed) - 1) << INTEL_PMC_IDX_FIXED);
- size = INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed;
+ mask = ((1ULL << max_pebs_events) - 1) |
+ (((1ULL << num_counters_fixed) - 1) << INTEL_PMC_IDX_FIXED);
+ size = INTEL_PMC_IDX_FIXED + num_counters_fixed;

if (unlikely(base >= top)) {
intel_pmu_pebs_event_update_no_drain(cpuc, size);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index b939784..bda4bdc 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -648,6 +648,9 @@ struct x86_hybrid_pmu {
struct pmu pmu;
union perf_capabilities intel_cap;
u64 intel_ctrl;
+ int max_pebs_events;
+ int num_counters;
+ int num_counters_fixed;
};

#define IS_X86_HYBRID cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)
@@ -853,7 +856,9 @@ struct x86_pmu {
* Most PMU capabilities are the same among different hybrid PMUs. The
* global x86_pmu saves the architecture capabilities, which are available
* for all PMUs. The hybrid_pmu only includes the unique capabilities.
+ * The hybrid_pmu_bitmap is the bits map of the possible hybrid_pmu.
*/
+ unsigned long hybrid_pmu_bitmap;
struct x86_hybrid_pmu hybrid_pmu[X86_HYBRID_PMU_MAX_INDEX];
};

--
2.7.4

2021-02-08 18:02:15

by Liang, Kan

Subject: [PATCH 34/49] tools headers uapi: Update tools's copy of linux/perf_event.h

From: Jin Yao <[email protected]>

To get the changes in:

("perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU")

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/include/uapi/linux/perf_event.h | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 7d292de5..83ab6a6 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -33,6 +33,8 @@ enum perf_type_id {
PERF_TYPE_HW_CACHE = 3,
PERF_TYPE_RAW = 4,
PERF_TYPE_BREAKPOINT = 5,
+ PERF_TYPE_HARDWARE_PMU = 6,
+ PERF_TYPE_HW_CACHE_PMU = 7,

PERF_TYPE_MAX, /* non-ABI */
};
@@ -95,6 +97,30 @@ enum perf_hw_cache_op_result_id {
};

/*
+ * attr.config layout for type PERF_TYPE_HARDWARE* and PERF_TYPE_HW_CACHE*
+ * PERF_TYPE_HARDWARE: 0xAA
+ * AA: hardware event ID
+ * PERF_TYPE_HW_CACHE: 0xCCBBAA
+ * AA: hardware cache ID
+ * BB: hardware cache op ID
+ * CC: hardware cache op result ID
+ * PERF_TYPE_HARDWARE_PMU: 0xDD000000AA
+ * AA: hardware event ID
+ * DD: PMU type ID
+ * PERF_TYPE_HW_CACHE_PMU: 0xDD00CCBBAA
+ * AA: hardware cache ID
+ * BB: hardware cache op ID
+ * CC: hardware cache op result ID
+ * DD: PMU type ID
+ */
+#define PERF_HW_CACHE_ID_SHIFT 0
+#define PERF_HW_CACHE_OP_ID_SHIFT 8
+#define PERF_HW_CACHE_OP_RESULT_ID_SHIFT 16
+#define PERF_HW_CACHE_EVENT_MASK 0xffffff
+
+#define PERF_PMU_TYPE_SHIFT 32
+
+/*
* Special "software" events provided by the kernel, even if the hardware
* does not support performance events. These events measure various
* physical and sw events of the kernel (and allow the profiling of them as
--
2.7.4

2021-02-08 18:02:24

by Liang, Kan

Subject: [PATCH 07/49] perf/x86: Hybrid PMU support for hardware cache event

From: Kan Liang <[email protected]>

The hardware cache events are different among hybrid PMUs. Each hybrid
PMU should have its own hw cache event table.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 11 +++++++++--
arch/x86/events/perf_event.h | 9 +++++++++
2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 5f79b37..27c87a7 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -351,6 +351,7 @@ set_ext_hw_attr(struct hw_perf_event *hwc, struct perf_event *event)
{
struct perf_event_attr *attr = &event->attr;
unsigned int cache_type, cache_op, cache_result;
+ struct x86_hybrid_pmu *pmu = IS_X86_HYBRID ? container_of(event->pmu, struct x86_hybrid_pmu, pmu) : NULL;
u64 config, val;

config = attr->config;
@@ -370,7 +371,10 @@ set_ext_hw_attr(struct hw_perf_event *hwc, struct perf_event *event)
return -EINVAL;
cache_result = array_index_nospec(cache_result, PERF_COUNT_HW_CACHE_RESULT_MAX);

- val = hw_cache_event_ids[cache_type][cache_op][cache_result];
+ if (pmu)
+ val = pmu->hw_cache_event_ids[cache_type][cache_op][cache_result];
+ else
+ val = hw_cache_event_ids[cache_type][cache_op][cache_result];

if (val == 0)
return -ENOENT;
@@ -379,7 +383,10 @@ set_ext_hw_attr(struct hw_perf_event *hwc, struct perf_event *event)
return -EINVAL;

hwc->config |= val;
- attr->config1 = hw_cache_extra_regs[cache_type][cache_op][cache_result];
+ if (pmu)
+ attr->config1 = pmu->hw_cache_extra_regs[cache_type][cache_op][cache_result];
+ else
+ attr->config1 = hw_cache_extra_regs[cache_type][cache_op][cache_result];
return x86_pmu_extra_regs(val, event);
}

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index f11dbc4..00fcd92 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -652,6 +652,15 @@ struct x86_hybrid_pmu {
int num_counters;
int num_counters_fixed;
struct event_constraint unconstrained;
+
+ u64 hw_cache_event_ids
+ [PERF_COUNT_HW_CACHE_MAX]
+ [PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX];
+ u64 hw_cache_extra_regs
+ [PERF_COUNT_HW_CACHE_MAX]
+ [PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX];
};

#define IS_X86_HYBRID cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)
--
2.7.4

2021-02-08 18:04:20

by Liang, Kan

Subject: [PATCH 02/49] x86/cpu: Describe hybrid CPUs in cpuinfo_x86

From: Ricardo Neri <[email protected]>

On processors with Intel Hybrid Technology (i.e., those having more than
one type of CPU in the same package), all CPUs support the same
instruction set and enumerate the same features in CPUID. Thus, all software can run
on any CPU without restrictions. However, there may be model-specific
differences among types of CPUs. For instance, each type of CPU may support
a different number of performance counters. Also, machine check error banks
may be wired differently. Even though most software will not care about
these differences, kernel subsystems dealing with these differences must
know. Add a new member to cpuinfo_x86 that subsystems can query to know
the type of CPU.

Hybrid processors also have a native model ID to uniquely identify the
micro-architecture of each CPU. Please note that the native model ID is
not related to the existing x86_model_id read from CPUID leaf 0x1.

In order to uniquely identify a CPU by type and micro-architecture, combine
the aforementioned identifiers into a single new member, x86_cpu_type.

Define also masks that subsystems can use to obtain the CPU type or native
model separately.

The Intel Software Developer's Manual defines the CPU type and the CPU
native model ID as 8-bit and 24-bit identifiers, respectively.
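
As a rough illustration (the helpers below are hypothetical and not
added by this patch; only the shift and mask are), a subsystem could
decode the new field as:

static inline u8 x86_hybrid_cpu_type(struct cpuinfo_x86 *c)
{
	return c->x86_cpu_type >> X86_HYBRID_CPU_TYPE_ID_SHIFT;
}

static inline u32 x86_hybrid_native_model_id(struct cpuinfo_x86 *c)
{
	return c->x86_cpu_type & X86_HYBRID_CPU_NATIVE_MODEL_ID_MASK;
}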

Cc: Andi Kleen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: "Ravi V. Shankar" <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Srinivas Pandruvada <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Tony Luck <[email protected]>
Reviewed-by: Len Brown <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Signed-off-by: Ricardo Neri <[email protected]>
---
arch/x86/include/asm/processor.h | 13 +++++++++++++
arch/x86/kernel/cpu/common.c | 3 +++
2 files changed, 16 insertions(+)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index c20a52b..1f25ac9 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -139,6 +139,16 @@ struct cpuinfo_x86 {
u32 microcode;
/* Address space bits used by the cache internally */
u8 x86_cache_bits;
+ /*
+ * In hybrid processors, there is a CPU type and a native model ID. The
+ * CPU type (x86_cpu_type[31:24]) describes the type of micro-
+ * architecture families. The native model ID (x86_cpu_type[23:0])
+ * describes a specific microarchitecture version. Combining both
+ * allows to uniquely identify a CPU.
+ *
+ * Please note that the native model ID is not related to x86_model.
+ */
+ u32 x86_cpu_type;
unsigned initialized : 1;
} __randomize_layout;

@@ -166,6 +176,9 @@ enum cpuid_regs_idx {

#define X86_VENDOR_UNKNOWN 0xff

+#define X86_HYBRID_CPU_TYPE_ID_SHIFT 24
+#define X86_HYBRID_CPU_NATIVE_MODEL_ID_MASK 0xffffff
+
/*
* capabilities of CPUs
*/
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 35ad848..a66c1fd 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -932,6 +932,9 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
c->x86_capability[CPUID_D_1_EAX] = eax;
}

+ if (cpu_has(c, X86_FEATURE_HYBRID_CPU))
+ c->x86_cpu_type = cpuid_eax(0x0000001a);
+
/* AMD-defined flags: level 0x80000001 */
eax = cpuid_eax(0x80000000);
c->extended_cpuid_level = eax;
--
2.7.4

2021-02-08 18:04:44

by Liang, Kan

Subject: [PATCH 36/49] perf parse-events: Create two hybrid cache events

From: Jin Yao <[email protected]>

Cache events have pre-defined configs. The kernel needs to know which
pmu a cache event comes from (e.g. the cpu_core pmu or the cpu_atom
pmu), but the perf type 'PERF_TYPE_HW_CACHE' can't carry pmu
information.

So the kernel introduces a new type 'PERF_TYPE_HW_CACHE_PMU'.

The new attr.config layout for PERF_TYPE_HW_CACHE_PMU is

0xDD00CCBBAA
0xAA: hardware cache ID
0xBB: hardware cache op ID
0xCC: hardware cache op result ID
0xDD: PMU type ID

As with hardware events, the PMU type ID is retrieved from sysfs.

When enabling a hybrid cache event without a specified pmu, such as
'perf stat -e L1-dcache-loads -a', two events are created
automatically: one for atom, the other for core.

root@otcpl-adl-s-2:~# ./perf stat -e L1-dcache-loads -vv -a -- sleep 1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 7
size 120
config 0x400000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
------------------------------------------------------------
perf_event_attr:
type 7
size 120
config 0xa00000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 21
sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 22
sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 23
sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 24
sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 25
sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 26
sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
L1-dcache-loads: 0: 77398 1001256700 1001256700
L1-dcache-loads: 1: 5286 1001255101 1001255101
L1-dcache-loads: 2: 26432 1001280449 1001280449
L1-dcache-loads: 3: 2853 1001274145 1001274145
L1-dcache-loads: 4: 521391 1001304618 1001304618
L1-dcache-loads: 5: 1231 1001287686 1001287686
L1-dcache-loads: 6: 1237 1001284439 1001284439
L1-dcache-loads: 7: 1384 1001278646 1001278646
L1-dcache-loads: 8: 1238 1001274988 1001274988
L1-dcache-loads: 9: 1225 1001267988 1001267988
L1-dcache-loads: 10: 88066 1001301843 1001301843
L1-dcache-loads: 11: 1243 1001308922 1001308922
L1-dcache-loads: 12: 1231 1001313498 1001313498
L1-dcache-loads: 13: 12880 1001306597 1001306597
L1-dcache-loads: 14: 21244 1001293603 1001293603
L1-dcache-loads: 15: 1225 1001287958 1001287958
L1-dcache-loads: 0: 1244 1001289333 1001289333
L1-dcache-loads: 1: 1361 1001288189 1001288189
L1-dcache-loads: 2: 1226 1001285926 1001285926
L1-dcache-loads: 3: 1226 1001289431 1001289431
L1-dcache-loads: 4: 1239 1001283299 1001283299
L1-dcache-loads: 5: 10500 1001318113 1001318113
L1-dcache-loads: 6: 1226 1001315332 1001315332
L1-dcache-loads: 7: 1226 1001325366 1001325366
L1-dcache-loads: 765564 16020577181 16020577181
L1-dcache-loads: 19248 8010394989 8010394989

Performance counter stats for 'system wide':

765,564 L1-dcache-loads
19,248 L1-dcache-loads

1.002255760 seconds time elapsed

type 7 is PERF_TYPE_HW_CACHE_PMU.
0x4 in 0x400000000 indicates the cpu_core pmu.
0xa in 0xa00000000 indicates the cpu_atom pmu.
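
For reference, a short sketch (illustrative fragment, not part of the
patch) of how these two config values are composed with the macros from
the uapi header update:

	/*
	 * L1-dcache-loads is cache ID 0 (L1D), op ID 0 (READ) and result
	 * ID 0 (ACCESS), so the low 24 bits are zero; only the PMU type
	 * in bits 32+ differs. The type IDs 4 (cpu_core) and 10 (cpu_atom)
	 * are the values on this test machine, read from sysfs.
	 */
	__u64 cache = (PERF_COUNT_HW_CACHE_L1D << PERF_HW_CACHE_ID_SHIFT) |
		      (PERF_COUNT_HW_CACHE_OP_READ << PERF_HW_CACHE_OP_ID_SHIFT) |
		      (PERF_COUNT_HW_CACHE_RESULT_ACCESS << PERF_HW_CACHE_OP_RESULT_ID_SHIFT);
	__u64 core_config = cache | ((__u64)4 << PERF_PMU_TYPE_SHIFT);  /* 0x400000000 */
	__u64 atom_config = cache | ((__u64)10 << PERF_PMU_TYPE_SHIFT); /* 0xa00000000 */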

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/parse-events.c | 54 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 53 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 1e767dc..28d356e 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -464,6 +464,48 @@ static void config_hybrid_attr(struct perf_event_attr *attr,
attr->config = attr->config | ((__u64)pmu_type << PERF_PMU_TYPE_SHIFT);
}

+static int create_hybrid_cache_event(struct list_head *list, int *idx,
+ struct perf_event_attr *attr, char *name,
+ struct list_head *config_terms,
+ struct perf_pmu *pmu)
+{
+ struct evsel *evsel;
+ __u32 type = attr->type;
+ __u64 config = attr->config;
+
+ config_hybrid_attr(attr, PERF_TYPE_HW_CACHE_PMU, pmu->type);
+ evsel = __add_event(list, idx, attr, true, name,
+ pmu, config_terms, false, NULL);
+ if (evsel)
+ evsel->pmu_name = strdup(pmu->name);
+ else
+ return -ENOMEM;
+
+ attr->type = type;
+ attr->config = config;
+ return 0;
+}
+
+static int add_hybrid_cache(struct list_head *list, int *idx,
+ struct perf_event_attr *attr, char *name,
+ struct list_head *config_terms,
+ bool *hybrid)
+{
+ struct perf_pmu *pmu;
+ int ret;
+
+ *hybrid = false;
+ perf_pmu__for_each_hybrid_pmus(pmu) {
+ *hybrid = true;
+ ret = create_hybrid_cache_event(list, idx, attr, name,
+ config_terms, pmu);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
int parse_events_add_cache(struct list_head *list, int *idx,
char *type, char *op_result1, char *op_result2,
struct parse_events_error *err,
@@ -474,7 +516,8 @@ int parse_events_add_cache(struct list_head *list, int *idx,
char name[MAX_NAME_LEN], *config_name;
int cache_type = -1, cache_op = -1, cache_result = -1;
char *op_result[2] = { op_result1, op_result2 };
- int i, n;
+ int i, n, ret;
+ bool hybrid;

/*
* No fallback - if we cannot get a clear cache type
@@ -534,6 +577,15 @@ int parse_events_add_cache(struct list_head *list, int *idx,
if (get_config_terms(head_config, &config_terms))
return -ENOMEM;
}
+
+ if (!perf_pmu__hybrid_exist())
+ perf_pmu__scan(NULL);
+
+ ret = add_hybrid_cache(list, idx, &attr, config_name ? : name,
+ &config_terms, &hybrid);
+ if (hybrid)
+ return ret;
+
return add_event(list, idx, &attr, config_name ? : name, &config_terms);
}

--
2.7.4

2021-02-08 18:04:50

by Liang, Kan

Subject: [PATCH 30/49] perf list: Support --cputype option to list hybrid pmu events

From: Jin Yao <[email protected]>

This patch adds a new option '--cputype' to list only core pmu events
or only atom pmu events.

For example,

perf list --cputype atom
...
cache:
core_reject_l2q.any
[Counts the number of request that were not accepted into the L2Q because the L2Q is FULL. Unit: cpu_atom]
dl1.dirty_eviction
[Counts all L1D cacheline (dirty) evictions caused by miss, stores, and prefetches. Does not count evictions or dirty writebacks caused by snoops.
Does not count a replacement unless a (dirty) line was written back. Unit: cpu_atom]
...

We can see "Unit: cpu_atom" is displayed in the brief description
section which indicate the event is atom event.

There are some duplicated events, e.g. inst_retired.any. This patch
lists them all.

perf list
...
inst_retired.any
[Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
inst_retired.any
[Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core]

One "inst_retired.any" is available on cpu_atom, another "inst_retired.any" is
available on cpu_core.

Each hybrid pmu event has been assigned a pmu, so this patch just
compares the pmu names before listing the result.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/Documentation/perf-list.txt | 4 ++++
tools/perf/builtin-list.c | 42 ++++++++++++++++++++++++----------
tools/perf/util/metricgroup.c | 6 ++++-
tools/perf/util/metricgroup.h | 2 +-
tools/perf/util/parse-events.c | 8 ++++---
tools/perf/util/parse-events.h | 3 ++-
tools/perf/util/pmu.c | 30 ++++++++++++++++++++----
tools/perf/util/pmu.h | 2 +-
8 files changed, 73 insertions(+), 24 deletions(-)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 4c7db1d..4dc8d0a 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -39,6 +39,10 @@ any extra expressions computed by perf stat.
--deprecated::
Print deprecated events. By default the deprecated events are hidden.

+--cputype::
+Print events applying cpu with this type for hybrid platform
+(e.g. --cputype core or --cputype atom)
+
[[EVENT_MODIFIERS]]
EVENT MODIFIERS
---------------
diff --git a/tools/perf/builtin-list.c b/tools/perf/builtin-list.c
index 10ab5e4..5e4bef8 100644
--- a/tools/perf/builtin-list.c
+++ b/tools/perf/builtin-list.c
@@ -17,16 +17,19 @@
#include <subcmd/pager.h>
#include <subcmd/parse-options.h>
#include <stdio.h>
+#include <asm/bug.h>

static bool desc_flag = true;
static bool details_flag;
+static const char *hybrid_type;

int cmd_list(int argc, const char **argv)
{
- int i;
+ int i, ret = 0;
bool raw_dump = false;
bool long_desc_flag = false;
bool deprecated = false;
+ char *pmu_name = NULL;
struct option list_options[] = {
OPT_BOOLEAN(0, "raw-dump", &raw_dump, "Dump raw events"),
OPT_BOOLEAN('d', "desc", &desc_flag,
@@ -37,6 +40,9 @@ int cmd_list(int argc, const char **argv)
"Print information on the perf event names and expressions used internally by events."),
OPT_BOOLEAN(0, "deprecated", &deprecated,
"Print deprecated events."),
+ OPT_STRING(0, "cputype", &hybrid_type, "hybrid cpu type",
+ "Print events applying cpu with this type for hybrid platform "
+ "(e.g. core or atom)"),
OPT_INCR(0, "debug", &verbose,
"Enable debugging output"),
OPT_END()
@@ -56,10 +62,16 @@ int cmd_list(int argc, const char **argv)
if (!raw_dump && pager_in_use())
printf("\nList of pre-defined events (to be used in -e):\n\n");

+ if (hybrid_type) {
+ pmu_name = perf_pmu__hybrid_type_to_pmu(hybrid_type);
+ if (!pmu_name)
+ WARN_ONCE(1, "WARNING: hybrid cputype is not supported!\n");
+ }
+
if (argc == 0) {
print_events(NULL, raw_dump, !desc_flag, long_desc_flag,
- details_flag, deprecated);
- return 0;
+ details_flag, deprecated, pmu_name);
+ goto out;
}

for (i = 0; i < argc; ++i) {
@@ -82,25 +94,27 @@ int cmd_list(int argc, const char **argv)
else if (strcmp(argv[i], "pmu") == 0)
print_pmu_events(NULL, raw_dump, !desc_flag,
long_desc_flag, details_flag,
- deprecated);
+ deprecated, pmu_name);
else if (strcmp(argv[i], "sdt") == 0)
print_sdt_events(NULL, NULL, raw_dump);
else if (strcmp(argv[i], "metric") == 0 || strcmp(argv[i], "metrics") == 0)
- metricgroup__print(true, false, NULL, raw_dump, details_flag);
+ metricgroup__print(true, false, NULL, raw_dump, details_flag, pmu_name);
else if (strcmp(argv[i], "metricgroup") == 0 || strcmp(argv[i], "metricgroups") == 0)
- metricgroup__print(false, true, NULL, raw_dump, details_flag);
+ metricgroup__print(false, true, NULL, raw_dump, details_flag, pmu_name);
else if ((sep = strchr(argv[i], ':')) != NULL) {
int sep_idx;

sep_idx = sep - argv[i];
s = strdup(argv[i]);
- if (s == NULL)
- return -1;
+ if (s == NULL) {
+ ret = -1;
+ goto out;
+ }

s[sep_idx] = '\0';
print_tracepoint_events(s, s + sep_idx + 1, raw_dump);
print_sdt_events(s, s + sep_idx + 1, raw_dump);
- metricgroup__print(true, true, s, raw_dump, details_flag);
+ metricgroup__print(true, true, s, raw_dump, details_flag, pmu_name);
free(s);
} else {
if (asprintf(&s, "*%s*", argv[i]) < 0) {
@@ -116,12 +130,16 @@ int cmd_list(int argc, const char **argv)
print_pmu_events(s, raw_dump, !desc_flag,
long_desc_flag,
details_flag,
- deprecated);
+ deprecated,
+ pmu_name);
print_tracepoint_events(NULL, s, raw_dump);
print_sdt_events(NULL, s, raw_dump);
- metricgroup__print(true, true, s, raw_dump, details_flag);
+ metricgroup__print(true, true, s, raw_dump, details_flag, pmu_name);
free(s);
}
}
- return 0;
+
+out:
+ free(pmu_name);
+ return ret;
}
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index ee94d3e..df05134 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -609,7 +609,7 @@ static int metricgroup__print_sys_event_iter(struct pmu_event *pe, void *data)
}

void metricgroup__print(bool metrics, bool metricgroups, char *filter,
- bool raw, bool details)
+ bool raw, bool details, const char *pmu_name)
{
struct pmu_events_map *map = perf_pmu__find_map(NULL);
struct pmu_event *pe;
@@ -635,6 +635,10 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter,
break;
if (!pe->metric_expr)
continue;
+ if (pmu_name && perf_pmu__is_hybrid(pe->pmu) &&
+ strcmp(pmu_name, pe->pmu)) {
+ continue;
+ }
if (metricgroup__print_pmu_event(pe, metricgroups, filter,
raw, details, &groups,
metriclist) < 0)
diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
index ed1b939..b03111b 100644
--- a/tools/perf/util/metricgroup.h
+++ b/tools/perf/util/metricgroup.h
@@ -53,7 +53,7 @@ int metricgroup__parse_groups_test(struct evlist *evlist,
struct rblist *metric_events);

void metricgroup__print(bool metrics, bool groups, char *filter,
- bool raw, bool details);
+ bool raw, bool details, const char *pmu_name);
bool metricgroup__has_metric(const char *metric);
int arch_get_runtimeparam(struct pmu_event *pe __maybe_unused);
void metricgroup__rblist_exit(struct rblist *metric_events);
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 42c84ad..81a6fce 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -2893,7 +2893,8 @@ void print_symbol_events(const char *event_glob, unsigned type,
* Print the help text for the event symbols:
*/
void print_events(const char *event_glob, bool name_only, bool quiet_flag,
- bool long_desc, bool details_flag, bool deprecated)
+ bool long_desc, bool details_flag, bool deprecated,
+ const char *pmu_name)
{
print_symbol_events(event_glob, PERF_TYPE_HARDWARE,
event_symbols_hw, PERF_COUNT_HW_MAX, name_only);
@@ -2905,7 +2906,7 @@ void print_events(const char *event_glob, bool name_only, bool quiet_flag,
print_hwcache_events(event_glob, name_only);

print_pmu_events(event_glob, name_only, quiet_flag, long_desc,
- details_flag, deprecated);
+ details_flag, deprecated, pmu_name);

if (event_glob != NULL)
return;
@@ -2931,7 +2932,8 @@ void print_events(const char *event_glob, bool name_only, bool quiet_flag,

print_sdt_events(NULL, NULL, name_only);

- metricgroup__print(true, true, NULL, name_only, details_flag);
+ metricgroup__print(true, true, NULL, name_only, details_flag,
+ pmu_name);

print_libpfm_events(name_only, long_desc);
}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index e80c9b7..b875485 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -217,7 +217,8 @@ void parse_events_evlist_error(struct parse_events_state *parse_state,
int idx, const char *str);

void print_events(const char *event_glob, bool name_only, bool quiet,
- bool long_desc, bool details_flag, bool deprecated);
+ bool long_desc, bool details_flag, bool deprecated,
+ const char *pmu_name);

struct event_symbol {
const char *symbol;
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 04447f5..9a6c973 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1546,6 +1546,7 @@ static int cmp_sevent(const void *a, const void *b)
{
const struct sevent *as = a;
const struct sevent *bs = b;
+ int ret;

/* Put extra events last */
if (!!as->desc != !!bs->desc)
@@ -1561,7 +1562,13 @@ static int cmp_sevent(const void *a, const void *b)
if (as->is_cpu != bs->is_cpu)
return bs->is_cpu - as->is_cpu;

- return strcmp(as->name, bs->name);
+ ret = strcmp(as->name, bs->name);
+ if (!ret) {
+ if (as->pmu && bs->pmu)
+ return strcmp(as->pmu, bs->pmu);
+ }
+
+ return ret;
}

static void wordwrap(char *s, int start, int max, int corr)
@@ -1591,7 +1598,8 @@ bool is_pmu_core(const char *name)
}

void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
- bool long_desc, bool details_flag, bool deprecated)
+ bool long_desc, bool details_flag, bool deprecated,
+ const char *pmu_name)
{
struct perf_pmu *pmu;
struct perf_pmu_alias *alias;
@@ -1617,10 +1625,16 @@ void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
pmu = NULL;
j = 0;
while ((pmu = perf_pmu__scan(pmu)) != NULL) {
+ if (pmu_name && perf_pmu__is_hybrid(pmu->name) &&
+ strcmp(pmu_name, pmu->name)) {
+ continue;
+ }
+
list_for_each_entry(alias, &pmu->aliases, list) {
char *name = alias->desc ? alias->name :
format_alias(buf, sizeof(buf), pmu, alias);
- bool is_cpu = is_pmu_core(pmu->name);
+ bool is_cpu = is_pmu_core(pmu->name) ||
+ perf_pmu__is_hybrid(pmu->name);

if (alias->deprecated && !deprecated)
continue;
@@ -1653,6 +1667,7 @@ void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
aliases[j].metric_expr = alias->metric_expr;
aliases[j].metric_name = alias->metric_name;
aliases[j].is_cpu = is_cpu;
+ aliases[j].pmu = alias->pmu;
j++;
}
if (pmu->selectable &&
@@ -1668,8 +1683,13 @@ void print_pmu_events(const char *event_glob, bool name_only, bool quiet_flag,
qsort(aliases, len, sizeof(struct sevent), cmp_sevent);
for (j = 0; j < len; j++) {
/* Skip duplicates */
- if (j > 0 && !strcmp(aliases[j].name, aliases[j - 1].name))
- continue;
+ if (j > 0 && !strcmp(aliases[j].name, aliases[j - 1].name)) {
+ if (!aliases[j].pmu || !aliases[j - 1].pmu ||
+ !strcmp(aliases[j].pmu, aliases[j - 1].pmu)) {
+ continue;
+ }
+ }
+
if (name_only) {
printf("%s ", aliases[j].name);
continue;
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index bb74595..5b727cf 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -107,7 +107,7 @@ struct perf_pmu *perf_pmu__scan(struct perf_pmu *pmu);
bool is_pmu_core(const char *name);
void print_pmu_events(const char *event_glob, bool name_only, bool quiet,
bool long_desc, bool details_flag,
- bool deprecated);
+ bool deprecated, const char *pmu_name);
bool pmu_have_event(const char *pname, const char *name);

int perf_pmu__scan_file(struct perf_pmu *pmu, const char *name, const char *fmt, ...) __scanf(3, 4);
--
2.7.4

2021-02-08 18:05:32

by Liang, Kan

Subject: [PATCH 09/49] perf/x86: Hybrid PMU support for extra_regs

From: Kan Liang <[email protected]>

Different hybrid PMUs may have different extra registers. E.g. the core
PMU may have offcore registers, a frontend register and a ldlat
register, while the atom PMU may only have offcore registers and a
ldlat register. Each hybrid PMU should use its own extra_regs.

An Intel hybrid system should always have extra registers, so
unconditionally allocate shared_regs on Intel hybrid systems.
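
As a rough illustration (names and indices are illustrative, not from
this patch): once the dedicated hybrid PMUs are registered, each
x86_hybrid_pmu instance carries its own table, and the lookups below
resolve it per event/cpuc:

	/* illustrative setup of the per-PMU tables */
	x86_pmu.hybrid_pmu[core_idx].extra_regs = core_extra_regs;
	x86_pmu.hybrid_pmu[atom_idx].extra_regs = atom_extra_regs;

	/* per-event lookup, as in x86_pmu_extra_regs() below */
	struct extra_reg *extra_regs = X86_HYBRID_READ_FROM_EVENT(extra_regs, event);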

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 5 +++--
arch/x86/events/intel/core.c | 15 +++++++++------
arch/x86/events/perf_event.h | 1 +
3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 2160142..6857934 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -148,15 +148,16 @@ u64 x86_perf_event_update(struct perf_event *event)
*/
static int x86_pmu_extra_regs(u64 config, struct perf_event *event)
{
+ struct extra_reg *extra_regs = X86_HYBRID_READ_FROM_EVENT(extra_regs, event);
struct hw_perf_event_extra *reg;
struct extra_reg *er;

reg = &event->hw.extra_reg;

- if (!x86_pmu.extra_regs)
+ if (!extra_regs)
return 0;

- for (er = x86_pmu.extra_regs; er->msr; er++) {
+ for (er = extra_regs; er->msr; er++) {
if (er->event != (config & er->config_mask))
continue;
if (event->attr.config1 & ~er->valid_mask)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 9acfa82..582d191 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2966,8 +2966,10 @@ intel_vlbr_constraints(struct perf_event *event)
return NULL;
}

-static int intel_alt_er(int idx, u64 config)
+static int intel_alt_er(struct cpu_hw_events *cpuc,
+ int idx, u64 config)
{
+ struct extra_reg *extra_regs = X86_HYBRID_READ_FROM_CPUC(extra_regs, cpuc);
int alt_idx = idx;

if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1))
@@ -2979,7 +2981,7 @@ static int intel_alt_er(int idx, u64 config)
if (idx == EXTRA_REG_RSP_1)
alt_idx = EXTRA_REG_RSP_0;

- if (config & ~x86_pmu.extra_regs[alt_idx].valid_mask)
+ if (config & ~extra_regs[alt_idx].valid_mask)
return idx;

return alt_idx;
@@ -2987,15 +2989,16 @@ static int intel_alt_er(int idx, u64 config)

static void intel_fixup_er(struct perf_event *event, int idx)
{
+ struct extra_reg *extra_regs = X86_HYBRID_READ_FROM_EVENT(extra_regs, event);
event->hw.extra_reg.idx = idx;

if (idx == EXTRA_REG_RSP_0) {
event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
- event->hw.config |= x86_pmu.extra_regs[EXTRA_REG_RSP_0].event;
+ event->hw.config |= extra_regs[EXTRA_REG_RSP_0].event;
event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0;
} else if (idx == EXTRA_REG_RSP_1) {
event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
- event->hw.config |= x86_pmu.extra_regs[EXTRA_REG_RSP_1].event;
+ event->hw.config |= extra_regs[EXTRA_REG_RSP_1].event;
event->hw.extra_reg.reg = MSR_OFFCORE_RSP_1;
}
}
@@ -3071,7 +3074,7 @@ __intel_shared_reg_get_constraints(struct cpu_hw_events *cpuc,
*/
c = NULL;
} else {
- idx = intel_alt_er(idx, reg->config);
+ idx = intel_alt_er(cpuc, idx, reg->config);
if (idx != reg->idx) {
raw_spin_unlock_irqrestore(&era->lock, flags);
goto again;
@@ -4162,7 +4165,7 @@ int intel_cpuc_prepare(struct cpu_hw_events *cpuc, int cpu)
{
cpuc->pebs_record_size = x86_pmu.pebs_record_size;

- if (x86_pmu.extra_regs || x86_pmu.lbr_sel_map) {
+ if (IS_X86_HYBRID || x86_pmu.extra_regs || x86_pmu.lbr_sel_map) {
cpuc->shared_regs = allocate_shared_regs(cpu);
if (!cpuc->shared_regs)
goto err;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 7a5d036..109139c 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -663,6 +663,7 @@ struct x86_hybrid_pmu {
[PERF_COUNT_HW_CACHE_RESULT_MAX];
struct event_constraint *event_constraints;
struct event_constraint *pebs_constraints;
+ struct extra_reg *extra_regs;
};

#define IS_X86_HYBRID cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)
--
2.7.4

2021-02-08 18:05:43

by Liang, Kan

Subject: [PATCH 11/49] perf/x86/intel: Factor out intel_pmu_check_event_constraints

From: Kan Liang <[email protected]>

Each Hybrid PMU has to check and update its own event constraints before
registration.

intel_pmu_check_event_constraints() will be reused later when
registering a dedicated hybrid PMU.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 82 +++++++++++++++++++++++++-------------------
1 file changed, 47 insertions(+), 35 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 2c02e1e..529bb7d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4239,6 +4239,49 @@ static void intel_pmu_check_num_counters(int *num_counters,
*intel_ctrl |= fixed_mask << INTEL_PMC_IDX_FIXED;
}

+static void intel_pmu_check_event_constraints(struct event_constraint *event_constraints,
+ int num_counters,
+ int num_counters_fixed,
+ u64 intel_ctrl)
+{
+ struct event_constraint *c;
+
+ if (!event_constraints)
+ return;
+
+ /*
+ * event on fixed counter2 (REF_CYCLES) only works on this
+ * counter, so do not extend mask to generic counters
+ */
+ for_each_event_constraint(c, event_constraints) {
+ /*
+ * Don't extend the topdown slots and metrics
+ * events to the generic counters.
+ */
+ if (c->idxmsk64 & INTEL_PMC_MSK_TOPDOWN) {
+ /*
+ * Disable topdown slots and metrics events,
+ * if slots event is not in CPUID.
+ */
+ if (!(INTEL_PMC_MSK_FIXED_SLOTS & intel_ctrl))
+ c->idxmsk64 = 0;
+ c->weight = hweight64(c->idxmsk64);
+ continue;
+ }
+
+ if (c->cmask == FIXED_EVENT_FLAGS) {
+ /* Disabled fixed counters which are not in CPUID */
+ c->idxmsk64 &= intel_ctrl;
+
+ if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
+ c->idxmsk64 |= (1ULL << num_counters) - 1;
+ }
+ c->idxmsk64 &=
+ ~(~0ULL << (INTEL_PMC_IDX_FIXED + num_counters_fixed));
+ c->weight = hweight64(c->idxmsk64);
+ }
+}
+
static void intel_pmu_cpu_starting(int cpu)
{
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
@@ -5098,7 +5141,6 @@ __init int intel_pmu_init(void)
union cpuid10_edx edx;
union cpuid10_eax eax;
union cpuid10_ebx ebx;
- struct event_constraint *c;
unsigned int fixed_mask;
struct extra_reg *er;
bool pmem = false;
@@ -5736,40 +5778,10 @@ __init int intel_pmu_init(void)
if (x86_pmu.intel_cap.anythread_deprecated)
x86_pmu.format_attrs = intel_arch_formats_attr;

- if (x86_pmu.event_constraints) {
- /*
- * event on fixed counter2 (REF_CYCLES) only works on this
- * counter, so do not extend mask to generic counters
- */
- for_each_event_constraint(c, x86_pmu.event_constraints) {
- /*
- * Don't extend the topdown slots and metrics
- * events to the generic counters.
- */
- if (c->idxmsk64 & INTEL_PMC_MSK_TOPDOWN) {
- /*
- * Disable topdown slots and metrics events,
- * if slots event is not in CPUID.
- */
- if (!(INTEL_PMC_MSK_FIXED_SLOTS & x86_pmu.intel_ctrl))
- c->idxmsk64 = 0;
- c->weight = hweight64(c->idxmsk64);
- continue;
- }
-
- if (c->cmask == FIXED_EVENT_FLAGS) {
- /* Disabled fixed counters which are not in CPUID */
- c->idxmsk64 &= x86_pmu.intel_ctrl;
-
- if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
- c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
- }
- c->idxmsk64 &=
- ~(~0ULL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
- c->weight = hweight64(c->idxmsk64);
- }
- }
-
+ intel_pmu_check_event_constraints(x86_pmu.event_constraints,
+ x86_pmu.num_counters,
+ x86_pmu.num_counters_fixed,
+ x86_pmu.intel_ctrl);
/*
* Access LBR MSR may cause #GP under certain circumstances.
* E.g. KVM doesn't support LBR MSR
--
2.7.4

2021-02-08 18:06:09

by Liang, Kan

Subject: [PATCH 08/49] perf/x86: Hybrid PMU support for event constraints

From: Kan Liang <[email protected]>

The events are different among hybrid PMUs. Each hybrid PMU should use
its own event constraints.
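
For illustration, the per-event accessor added below expands roughly to

	pebs_constraints = IS_X86_HYBRID ?
		((struct x86_hybrid_pmu *)event->pmu)->pebs_constraints :
		x86_pmu.pebs_constraints;

so on a non-hybrid system the data still comes from the global x86_pmu
and existing behaviour is unchanged.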

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 3 ++-
arch/x86/events/intel/core.c | 5 +++--
arch/x86/events/intel/ds.c | 5 +++--
arch/x86/events/perf_event.h | 5 +++++
4 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 27c87a7..2160142 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1514,6 +1514,7 @@ void perf_event_print_debug(void)
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
int num_counters = X86_HYBRID_READ_FROM_CPUC(num_counters, cpuc);
int num_counters_fixed = X86_HYBRID_READ_FROM_CPUC(num_counters_fixed, cpuc);
+ struct event_constraint *pebs_constraints = X86_HYBRID_READ_FROM_CPUC(pebs_constraints, cpuc);
unsigned long flags;
int idx;

@@ -1533,7 +1534,7 @@ void perf_event_print_debug(void)
pr_info("CPU#%d: status: %016llx\n", cpu, status);
pr_info("CPU#%d: overflow: %016llx\n", cpu, overflow);
pr_info("CPU#%d: fixed: %016llx\n", cpu, fixed);
- if (x86_pmu.pebs_constraints) {
+ if (pebs_constraints) {
rdmsrl(MSR_IA32_PEBS_ENABLE, pebs);
pr_info("CPU#%d: pebs: %016llx\n", cpu, pebs);
}
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 9baa6b6..9acfa82 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3136,10 +3136,11 @@ struct event_constraint *
x86_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
struct perf_event *event)
{
+ struct event_constraint *event_constraints = X86_HYBRID_READ_FROM_CPUC(event_constraints, cpuc);
struct event_constraint *c;

- if (x86_pmu.event_constraints) {
- for_each_event_constraint(c, x86_pmu.event_constraints) {
+ if (event_constraints) {
+ for_each_event_constraint(c, event_constraints) {
if (constraint_match(c, event->hw.config)) {
event->hw.flags |= c->flags;
return c;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index a528966..ba651d9 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -959,13 +959,14 @@ struct event_constraint intel_spr_pebs_event_constraints[] = {

struct event_constraint *intel_pebs_constraints(struct perf_event *event)
{
+ struct event_constraint *pebs_constraints = X86_HYBRID_READ_FROM_EVENT(pebs_constraints, event);
struct event_constraint *c;

if (!event->attr.precise_ip)
return NULL;

- if (x86_pmu.pebs_constraints) {
- for_each_event_constraint(c, x86_pmu.pebs_constraints) {
+ if (pebs_constraints) {
+ for_each_event_constraint(c, pebs_constraints) {
if (constraint_match(c, event->hw.config)) {
event->hw.flags |= c->flags;
return c;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 00fcd92..7a5d036 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -661,6 +661,8 @@ struct x86_hybrid_pmu {
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX];
+ struct event_constraint *event_constraints;
+ struct event_constraint *pebs_constraints;
};

#define IS_X86_HYBRID cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)
@@ -673,6 +675,9 @@ struct x86_hybrid_pmu {
#define X86_HYBRID_READ_FROM_CPUC(_name, _cpuc) \
(_cpuc && HAS_VALID_HYBRID_PMU_IN_CPUC(_cpuc) ? x86_pmu.hybrid_pmu[(_cpuc)->hybrid_pmu_idx]._name : x86_pmu._name)

+#define X86_HYBRID_READ_FROM_EVENT(_name, _event) \
+ (IS_X86_HYBRID ? ((struct x86_hybrid_pmu *)(_event->pmu))->_name : x86_pmu._name)
+
/*
* struct x86_pmu - generic x86 pmu
*/
--
2.7.4

2021-02-08 18:06:10

by Liang, Kan

Subject: [PATCH 32/49] perf header: Support HYBRID_TOPOLOGY feature

From: Jin Yao <[email protected]>

It would be useful to let the user know the hybrid topology.
The new HYBRID_TOPOLOGY feature in the header records which cpus are
core cpus and which cpus are atom cpus.
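
The on-disk layout of the new feature section, as written by
write_hybrid_topology() below, is:

<nr nodes: u32>
<pmu name: string> <cpu list: string>   (repeated <nr nodes> times)

e.g. the two nodes on the hybrid platform below are ("cpu_core", "0-15")
and ("cpu_atom", "16-23").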

With this patch,

On a hybrid platform:

root@otcpl-adl-s-2:~# ./perf report --header-only -I
...
# cpu_core cpu list : 0-15
# cpu_atom cpu list : 16-23

On a non-hybrid platform:

root@kbl-ppc:~# ./perf report --header-only -I
...
# missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY

It just shows HYBRID_TOPOLOGY as a missing feature.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/cputopo.c | 80 +++++++++++++++++++++++++++++++++++++++++
tools/perf/util/cputopo.h | 13 +++++++
tools/perf/util/env.c | 6 ++++
tools/perf/util/env.h | 7 ++++
tools/perf/util/header.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/header.h | 1 +
tools/perf/util/pmu.c | 1 -
tools/perf/util/pmu.h | 1 +
8 files changed, 200 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
index 1b52402..4a00fb8 100644
--- a/tools/perf/util/cputopo.c
+++ b/tools/perf/util/cputopo.c
@@ -12,6 +12,7 @@
#include "cpumap.h"
#include "debug.h"
#include "env.h"
+#include "pmu.h"

#define CORE_SIB_FMT \
"%s/devices/system/cpu/cpu%d/topology/core_siblings_list"
@@ -351,3 +352,82 @@ void numa_topology__delete(struct numa_topology *tp)

free(tp);
}
+
+static int load_hybrid_node(struct hybrid_topology_node *node,
+ struct perf_pmu *pmu)
+{
+ const char *sysfs;
+ char path[PATH_MAX];
+ char *buf = NULL, *p;
+ FILE *fp;
+ size_t len = 0;
+
+ node->pmu_name = strdup(pmu->name);
+ if (!node->pmu_name)
+ return -1;
+
+ sysfs = sysfs__mountpoint();
+ snprintf(path, PATH_MAX, CPUS_TEMPLATE_CPU, sysfs, pmu->name);
+
+ fp = fopen(path, "r");
+ if (!fp)
+ goto err;
+
+ if (getline(&buf, &len, fp) <= 0) {
+ fclose(fp);
+ goto err;
+ }
+
+ p = strchr(buf, '\n');
+ if (p)
+ *p = '\0';
+
+ fclose(fp);
+ node->cpus = buf;
+ return 0;
+
+err:
+ zfree(&node->pmu_name);
+ free(buf);
+ return -1;
+}
+
+struct hybrid_topology *hybrid_topology__new(void)
+{
+ struct perf_pmu *pmu;
+ struct hybrid_topology *tp = NULL;
+ u32 nr = 0, i = 0;
+
+ perf_pmu__for_each_hybrid_pmus(pmu)
+ nr++;
+
+ if (nr == 0)
+ return NULL;
+
+ tp = zalloc(sizeof(*tp) + sizeof(tp->nodes[0]) * nr);
+ if (!tp)
+ return NULL;
+
+ tp->nr = nr;
+ perf_pmu__for_each_hybrid_pmus(pmu) {
+ if (load_hybrid_node(&tp->nodes[i], pmu)) {
+ hybrid_topology__delete(tp);
+ return NULL;
+ }
+ i++;
+ }
+
+ return tp;
+}
+
+void hybrid_topology__delete(struct hybrid_topology *tp)
+{
+ u32 i;
+
+ for (i = 0; i < tp->nr; i++) {
+ zfree(&tp->nodes[i].pmu_name);
+ zfree(&tp->nodes[i].cpus);
+ }
+
+ free(tp);
+}
diff --git a/tools/perf/util/cputopo.h b/tools/perf/util/cputopo.h
index 6201c37..d9af971 100644
--- a/tools/perf/util/cputopo.h
+++ b/tools/perf/util/cputopo.h
@@ -25,10 +25,23 @@ struct numa_topology {
struct numa_topology_node nodes[];
};

+struct hybrid_topology_node {
+ char *pmu_name;
+ char *cpus;
+};
+
+struct hybrid_topology {
+ u32 nr;
+ struct hybrid_topology_node nodes[];
+};
+
struct cpu_topology *cpu_topology__new(void);
void cpu_topology__delete(struct cpu_topology *tp);

struct numa_topology *numa_topology__new(void);
void numa_topology__delete(struct numa_topology *tp);

+struct hybrid_topology *hybrid_topology__new(void);
+void hybrid_topology__delete(struct hybrid_topology *tp);
+
#endif /* __PERF_CPUTOPO_H */
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 9130f6f..9e05eca 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -202,6 +202,12 @@ void perf_env__exit(struct perf_env *env)
for (i = 0; i < env->nr_memory_nodes; i++)
zfree(&env->memory_nodes[i].set);
zfree(&env->memory_nodes);
+
+ for (i = 0; i < env->nr_hybrid_nodes; i++) {
+ perf_cpu_map__put(env->hybrid_nodes[i].map);
+ zfree(&env->hybrid_nodes[i].pmu_name);
+ }
+ zfree(&env->hybrid_nodes);
}

void perf_env__init(struct perf_env *env __maybe_unused)
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index ca249bf..9ca7633 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -37,6 +37,11 @@ struct memory_node {
unsigned long *set;
};

+struct hybrid_node {
+ char *pmu_name;
+ struct perf_cpu_map *map;
+};
+
struct perf_env {
char *hostname;
char *os_release;
@@ -59,6 +64,7 @@ struct perf_env {
int nr_pmu_mappings;
int nr_groups;
int nr_cpu_pmu_caps;
+ int nr_hybrid_nodes;
char *cmdline;
const char **cmdline_argv;
char *sibling_cores;
@@ -77,6 +83,7 @@ struct perf_env {
struct numa_node *numa_nodes;
struct memory_node *memory_nodes;
unsigned long long memory_bsize;
+ struct hybrid_node *hybrid_nodes;
#ifdef HAVE_LIBBPF_SUPPORT
/*
* bpf_info_lock protects bpf rbtrees. This is needed because the
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index c4ed3dc..6bcd959 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -932,6 +932,40 @@ static int write_clock_data(struct feat_fd *ff,
return do_write(ff, data64, sizeof(*data64));
}

+static int write_hybrid_topology(struct feat_fd *ff,
+ struct evlist *evlist __maybe_unused)
+{
+ struct hybrid_topology *tp;
+ int ret;
+ u32 i;
+
+ tp = hybrid_topology__new();
+ if (!tp)
+ return -1;
+
+ ret = do_write(ff, &tp->nr, sizeof(u32));
+ if (ret < 0)
+ goto err;
+
+ for (i = 0; i < tp->nr; i++) {
+ struct hybrid_topology_node *n = &tp->nodes[i];
+
+ ret = do_write_string(ff, n->pmu_name);
+ if (ret < 0)
+ goto err;
+
+ ret = do_write_string(ff, n->cpus);
+ if (ret < 0)
+ goto err;
+ }
+
+ ret = 0;
+
+err:
+ hybrid_topology__delete(tp);
+ return ret;
+}
+
static int write_dir_format(struct feat_fd *ff,
struct evlist *evlist __maybe_unused)
{
@@ -1623,6 +1657,19 @@ static void print_clock_data(struct feat_fd *ff, FILE *fp)
clockid_name(clockid));
}

+static void print_hybrid_topology(struct feat_fd *ff, FILE *fp)
+{
+ int i;
+ struct hybrid_node *n;
+
+ for (i = 0; i < ff->ph->env.nr_hybrid_nodes; i++) {
+ n = &ff->ph->env.hybrid_nodes[i];
+
+ fprintf(fp, "# %s cpu list : ", n->pmu_name);
+ cpu_map__fprintf(n->map, fp);
+ }
+}
+
static void print_dir_format(struct feat_fd *ff, FILE *fp)
{
struct perf_session *session;
@@ -2849,6 +2896,50 @@ static int process_clock_data(struct feat_fd *ff,
return 0;
}

+static int process_hybrid_topology(struct feat_fd *ff,
+ void *data __maybe_unused)
+{
+ struct hybrid_node *nodes, *n;
+ u32 nr, i;
+ char *str;
+
+ /* nr nodes */
+ if (do_read_u32(ff, &nr))
+ return -1;
+
+ nodes = zalloc(sizeof(*nodes) * nr);
+ if (!nodes)
+ return -ENOMEM;
+
+ for (i = 0; i < nr; i++) {
+ n = &nodes[i];
+
+ n->pmu_name = do_read_string(ff);
+ if (!n->pmu_name)
+ goto error;
+
+ str = do_read_string(ff);
+ if (!str)
+ goto error;
+
+ n->map = perf_cpu_map__new(str);
+ free(str);
+ if (!n->map)
+ goto error;
+ }
+
+ ff->ph->env.nr_hybrid_nodes = nr;
+ ff->ph->env.hybrid_nodes = nodes;
+ return 0;
+
+error:
+ for (i = 0; i < nr; i++)
+ free(nodes[i].pmu_name);
+
+ free(nodes);
+ return -1;
+}
+
static int process_dir_format(struct feat_fd *ff,
void *_data __maybe_unused)
{
@@ -3117,6 +3208,7 @@ const struct perf_header_feature_ops feat_ops[HEADER_LAST_FEATURE] = {
FEAT_OPR(COMPRESSED, compressed, false),
FEAT_OPR(CPU_PMU_CAPS, cpu_pmu_caps, false),
FEAT_OPR(CLOCK_DATA, clock_data, false),
+ FEAT_OPN(HYBRID_TOPOLOGY, hybrid_topology, true),
};

struct header_print_data {
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 2aca717..3f12ec0 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -45,6 +45,7 @@ enum {
HEADER_COMPRESSED,
HEADER_CPU_PMU_CAPS,
HEADER_CLOCK_DATA,
+ HEADER_HYBRID_TOPOLOGY,
HEADER_LAST_FEATURE,
HEADER_FEAT_BITS = 256,
};
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 9a6c973..ca2fc67 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -607,7 +607,6 @@ static struct perf_cpu_map *__pmu_cpumask(const char *path)
*/
#define SYS_TEMPLATE_ID "./bus/event_source/devices/%s/identifier"
#define CPUS_TEMPLATE_UNCORE "%s/bus/event_source/devices/%s/cpumask"
-#define CPUS_TEMPLATE_CPU "%s/bus/event_source/devices/%s/cpus"

static struct perf_cpu_map *pmu_cpumask(const char *name)
{
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 5b727cf..ccffc05 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -20,6 +20,7 @@ enum {

#define PERF_PMU_FORMAT_BITS 64
#define EVENT_SOURCE_DEVICE_PATH "/bus/event_source/devices/"
+#define CPUS_TEMPLATE_CPU "%s/bus/event_source/devices/%s/cpus"

struct perf_event_attr;

--
2.7.4

2021-02-08 18:06:38

by Liang, Kan

Subject: [PATCH 10/49] perf/x86/intel: Factor out intel_pmu_check_num_counters

From: Kan Liang <[email protected]>

Each hybrid PMU has to check its own number of counters and mask its
fixed counters before registration.

intel_pmu_check_num_counters() will be reused later when registering a
dedicated hybrid PMU.
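
As a worked example (illustrative numbers, not from this patch): with 8
generic counters, 3 fixed counters and INTEL_PMC_IDX_FIXED == 32, the
helper computes

	intel_ctrl = ((1ULL << 8) - 1) | (0x7ULL << 32) = 0x7000000ff

i.e. bits 0-7 for the generic counters and bits 32-34 for the fixed
counters.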

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 38 ++++++++++++++++++++++++--------------
1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 582d191..2c02e1e 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4219,6 +4219,26 @@ static void flip_smm_bit(void *data)
}
}

+static void intel_pmu_check_num_counters(int *num_counters,
+ int *num_counters_fixed,
+ u64 *intel_ctrl, u64 fixed_mask)
+{
+ if (*num_counters > INTEL_PMC_MAX_GENERIC) {
+ WARN(1, KERN_ERR "hw perf events %d > max(%d), clipping!",
+ *num_counters, INTEL_PMC_MAX_GENERIC);
+ *num_counters = INTEL_PMC_MAX_GENERIC;
+ }
+ *intel_ctrl = (1ULL << *num_counters) - 1;
+
+ if (*num_counters_fixed > INTEL_PMC_MAX_FIXED) {
+ WARN(1, KERN_ERR "hw perf events fixed %d > max(%d), clipping!",
+ *num_counters_fixed, INTEL_PMC_MAX_FIXED);
+ *num_counters_fixed = INTEL_PMC_MAX_FIXED;
+ }
+
+ *intel_ctrl |= fixed_mask << INTEL_PMC_IDX_FIXED;
+}
+
static void intel_pmu_cpu_starting(int cpu)
{
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
@@ -5707,20 +5727,10 @@ __init int intel_pmu_init(void)

x86_pmu.attr_update = attr_update;

- if (x86_pmu.num_counters > INTEL_PMC_MAX_GENERIC) {
- WARN(1, KERN_ERR "hw perf events %d > max(%d), clipping!",
- x86_pmu.num_counters, INTEL_PMC_MAX_GENERIC);
- x86_pmu.num_counters = INTEL_PMC_MAX_GENERIC;
- }
- x86_pmu.intel_ctrl = (1ULL << x86_pmu.num_counters) - 1;
-
- if (x86_pmu.num_counters_fixed > INTEL_PMC_MAX_FIXED) {
- WARN(1, KERN_ERR "hw perf events fixed %d > max(%d), clipping!",
- x86_pmu.num_counters_fixed, INTEL_PMC_MAX_FIXED);
- x86_pmu.num_counters_fixed = INTEL_PMC_MAX_FIXED;
- }
-
- x86_pmu.intel_ctrl |= (u64)fixed_mask << INTEL_PMC_IDX_FIXED;
+ intel_pmu_check_num_counters(&x86_pmu.num_counters,
+ &x86_pmu.num_counters_fixed,
+ &x86_pmu.intel_ctrl,
+ (u64)fixed_mask);

/* AnyThread may be deprecated on arch perfmon v5 or later */
if (x86_pmu.intel_cap.anythread_deprecated)
--
2.7.4

2021-02-08 18:07:33

by Liang, Kan

Subject: [PATCH 12/49] perf/x86/intel: Factor out intel_pmu_check_extra_regs

From: Kan Liang <[email protected]>

Each Hybrid PMU has to check and update its own extra registers before
registration.

intel_pmu_check_extra_regs() will be reused later when registering a
dedicated hybrid PMU.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 37 +++++++++++++++++++++++--------------
1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 529bb7d..559b4e9 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4282,6 +4282,28 @@ static void intel_pmu_check_event_constraints(struct event_constraint *event_con
}
}

+static bool check_msr(unsigned long msr, u64 mask);
+
+static void intel_pmu_check_extra_regs(struct extra_reg *extra_regs)
+{
+ struct extra_reg *er;
+
+ /*
+ * Access extra MSR may cause #GP under certain circumstances.
+ * E.g. KVM doesn't support offcore event
+ * Check all extra_regs here.
+ */
+ if (!extra_regs)
+ return;
+
+ for (er = extra_regs; er->msr; er++) {
+ er->extra_msr_access = check_msr(er->msr, 0x11UL);
+ /* Disable LBR select mapping */
+ if ((er->idx == EXTRA_REG_LBR) && !er->extra_msr_access)
+ x86_pmu.lbr_sel_map = NULL;
+ }
+}
+
static void intel_pmu_cpu_starting(int cpu)
{
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
@@ -5142,7 +5164,6 @@ __init int intel_pmu_init(void)
union cpuid10_eax eax;
union cpuid10_ebx ebx;
unsigned int fixed_mask;
- struct extra_reg *er;
bool pmem = false;
int version, i;
char *name;
@@ -5799,19 +5820,7 @@ __init int intel_pmu_init(void)
if (x86_pmu.lbr_nr)
pr_cont("%d-deep LBR, ", x86_pmu.lbr_nr);

- /*
- * Access extra MSR may cause #GP under certain circumstances.
- * E.g. KVM doesn't support offcore event
- * Check all extra_regs here.
- */
- if (x86_pmu.extra_regs) {
- for (er = x86_pmu.extra_regs; er->msr; er++) {
- er->extra_msr_access = check_msr(er->msr, 0x11UL);
- /* Disable LBR select mapping */
- if ((er->idx == EXTRA_REG_LBR) && !er->extra_msr_access)
- x86_pmu.lbr_sel_map = NULL;
- }
- }
+ intel_pmu_check_extra_regs(x86_pmu.extra_regs);

/* Support full width counters using alternative MSR range */
if (x86_pmu.intel_cap.full_width_write) {
--
2.7.4

2021-02-08 18:08:02

by Liang, Kan

Subject: [PATCH 13/49] perf/x86: Expose check_hw_exists

From: Kan Liang <[email protected]>

Hybrid PMUs may have different numbers of counters. Each hybrid PMU has
to check its own HW existence before registration.

Expose check_hw_exists(), and pass the numbers of generic and fixed
counters as parameters.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 10 +++++-----
arch/x86/events/perf_event.h | 2 ++
2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 6857934..29dee3f 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -239,7 +239,7 @@ static void release_pmc_hardware(void) {}

#endif

-static bool check_hw_exists(void)
+bool check_hw_exists(int num_counters, int num_counters_fixed)
{
u64 val, val_fail = -1, val_new= ~0;
int i, reg, reg_fail = -1, ret = 0;
@@ -250,7 +250,7 @@ static bool check_hw_exists(void)
* Check to see if the BIOS enabled any of the counters, if so
* complain and bail.
*/
- for (i = 0; i < x86_pmu.num_counters; i++) {
+ for (i = 0; i < num_counters; i++) {
reg = x86_pmu_config_addr(i);
ret = rdmsrl_safe(reg, &val);
if (ret)
@@ -264,12 +264,12 @@ static bool check_hw_exists(void)
}
}

- if (x86_pmu.num_counters_fixed) {
+ if (num_counters_fixed) {
reg = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
ret = rdmsrl_safe(reg, &val);
if (ret)
goto msr_fail;
- for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
+ for (i = 0; i < num_counters_fixed; i++) {
if (fixed_counter_disabled(i, NULL))
continue;
if (val & (0x03 << i*4)) {
@@ -2012,7 +2012,7 @@ static int __init init_hw_perf_events(void)
pmu_check_apic();

/* sanity check that the hardware exists or is emulated */
- if (!check_hw_exists())
+ if (!check_hw_exists(x86_pmu.num_counters, x86_pmu.num_counters_fixed))
return 0;

pr_cont("%s PMU driver.\n", x86_pmu.name);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 109139c..560410c 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1024,6 +1024,8 @@ static inline int x86_pmu_rdpmc_index(int index)
return x86_pmu.rdpmc_index ? x86_pmu.rdpmc_index(index) : index;
}

+bool check_hw_exists(int num_counters, int num_counters_fixed);
+
int x86_add_exclusive(unsigned int what);

void x86_del_exclusive(unsigned int what);
--
2.7.4

2021-02-08 18:08:13

by Liang, Kan

Subject: [PATCH 33/49] perf header: Support hybrid CPU_PMU_CAPS

From: Jin Yao <[email protected]>

A hybrid platform may have several cpu pmus, such as "cpu_core" and
"cpu_atom". The CPU_PMU_CAPS feature in the perf header needs to be
improved to support multiple cpu pmus.

The new layout in header is:

<nr_caps>
<caps string>
<caps string>
<pmu name>
<nr of rest pmus>

The new layout is also designed to be compatible with old perf.data
files.
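
For example, on the hybrid platform below the feature section contains
two blocks (cap strings taken from the report output below; shown only
for illustration):

<nr_caps=3> "branches" "32" "max_precise" "3" "pmu_name" "alderlake_hybrid"
"cpu_core" <nr of rest pmus=1>
<nr_caps=3> "branches" "32" "max_precise" "3" "pmu_name" "alderlake_hybrid"
"cpu_atom" <nr of rest pmus=0>

An old perf binary only reads the first <nr_caps> and its cap strings
and ignores the rest, so it still prints a single "cpu pmu
capabilities" line.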

With this patch,

On hybrid platform with new perf.data

root@otcpl-adl-s-2:~# ./perf report --header-only -I
...
# cpu_core pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid
# cpu_atom pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid

On hybrid platform with old perf.data

root@otcpl-adl-s-2:~# ./perf report --header-only -I
...
# cpu pmu capabilities: branches=32, max_precise=3, pmu_name=alderlake_hybrid

On non-hybrid platform with old perf.data

root@kbl-ppc:~# ./perf report --header-only -I
...
# cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake

On non-hybrid platform with new perf.data

root@kbl-ppc:~# ./perf report --header-only -I
...
# cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake

It's also tested with an old perf reading a new perf.data file.

root@kbl-ppc:~# perf report --header-only -I
...
# cpu pmu capabilities: branches=32, max_precise=3, pmu_name=skylake

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/env.c | 6 ++
tools/perf/util/env.h | 11 ++-
tools/perf/util/header.c | 175 +++++++++++++++++++++++++++++++++++++++++------
3 files changed, 168 insertions(+), 24 deletions(-)

diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 9e05eca..8ef24aa 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -208,6 +208,12 @@ void perf_env__exit(struct perf_env *env)
zfree(&env->hybrid_nodes[i].pmu_name);
}
zfree(&env->hybrid_nodes);
+
+ for (i = 0; i < env->nr_cpu_pmu_caps_nodes; i++) {
+ zfree(&env->cpu_pmu_caps_nodes[i].cpu_pmu_caps);
+ zfree(&env->cpu_pmu_caps_nodes[i].pmu_name);
+ }
+ zfree(&env->cpu_pmu_caps_nodes);
}

void perf_env__init(struct perf_env *env __maybe_unused)
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 9ca7633..5552c98 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -42,6 +42,13 @@ struct hybrid_node {
struct perf_cpu_map *map;
};

+struct cpu_pmu_caps_node {
+ int nr_cpu_pmu_caps;
+ unsigned int max_branches;
+ char *cpu_pmu_caps;
+ char *pmu_name;
+};
+
struct perf_env {
char *hostname;
char *os_release;
@@ -63,15 +70,14 @@ struct perf_env {
int nr_memory_nodes;
int nr_pmu_mappings;
int nr_groups;
- int nr_cpu_pmu_caps;
int nr_hybrid_nodes;
+ int nr_cpu_pmu_caps_nodes;
char *cmdline;
const char **cmdline_argv;
char *sibling_cores;
char *sibling_dies;
char *sibling_threads;
char *pmu_mappings;
- char *cpu_pmu_caps;
struct cpu_topology_map *cpu;
struct cpu_cache_level *caches;
int caches_cnt;
@@ -84,6 +90,7 @@ struct perf_env {
struct memory_node *memory_nodes;
unsigned long long memory_bsize;
struct hybrid_node *hybrid_nodes;
+ struct cpu_pmu_caps_node *cpu_pmu_caps_nodes;
#ifdef HAVE_LIBBPF_SUPPORT
/*
* bpf_info_lock protects bpf rbtrees. This is needed because the
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 6bcd959..b161ce3 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1459,18 +1459,22 @@ static int write_compressed(struct feat_fd *ff __maybe_unused,
return do_write(ff, &(ff->ph->env.comp_mmap_len), sizeof(ff->ph->env.comp_mmap_len));
}

-static int write_cpu_pmu_caps(struct feat_fd *ff,
- struct evlist *evlist __maybe_unused)
+static int write_per_cpu_pmu_caps(struct feat_fd *ff, struct perf_pmu *pmu,
+ int nr)
{
- struct perf_pmu *cpu_pmu = perf_pmu__find("cpu");
struct perf_pmu_caps *caps = NULL;
int nr_caps;
int ret;

- if (!cpu_pmu)
- return -ENOENT;
-
- nr_caps = perf_pmu__caps_parse(cpu_pmu);
+ /*
+ * The layout is:
+ * <nr_caps>
+ * <caps string>
+ * <caps string>
+ * <pmu name>
+ * <nr of rest pmus>
+ */
+ nr_caps = perf_pmu__caps_parse(pmu);
if (nr_caps < 0)
return nr_caps;

@@ -1478,7 +1482,7 @@ static int write_cpu_pmu_caps(struct feat_fd *ff,
if (ret < 0)
return ret;

- list_for_each_entry(caps, &cpu_pmu->caps, list) {
+ list_for_each_entry(caps, &pmu->caps, list) {
ret = do_write_string(ff, caps->name);
if (ret < 0)
return ret;
@@ -1488,9 +1492,50 @@ static int write_cpu_pmu_caps(struct feat_fd *ff,
return ret;
}

+ ret = do_write_string(ff, pmu->name);
+ if (ret < 0)
+ return ret;
+
+ ret = do_write(ff, &nr, sizeof(nr));
+ if (ret < 0)
+ return ret;
+
return ret;
}

+static int write_cpu_pmu_caps(struct feat_fd *ff,
+ struct evlist *evlist __maybe_unused)
+{
+ struct perf_pmu *pmu = perf_pmu__find("cpu");
+ u32 nr = 0;
+ int ret;
+
+ if (pmu)
+ nr = 1;
+ else {
+ perf_pmu__for_each_hybrid_pmus(pmu)
+ nr++;
+ pmu = NULL;
+ }
+
+ if (nr == 0)
+ return -1;
+
+ if (pmu) {
+ ret = write_per_cpu_pmu_caps(ff, pmu, 0);
+ if (ret < 0)
+ return ret;
+ } else {
+ perf_pmu__for_each_hybrid_pmus(pmu) {
+ ret = write_per_cpu_pmu_caps(ff, pmu, --nr);
+ if (ret < 0)
+ return ret;
+ }
+ }
+
+ return 0;
+}
+
static void print_hostname(struct feat_fd *ff, FILE *fp)
{
fprintf(fp, "# hostname : %s\n", ff->ph->env.hostname);
@@ -1963,18 +2008,28 @@ static void print_compressed(struct feat_fd *ff, FILE *fp)
ff->ph->env.comp_level, ff->ph->env.comp_ratio);
}

-static void print_cpu_pmu_caps(struct feat_fd *ff, FILE *fp)
+static void print_per_cpu_pmu_caps(FILE *fp, struct cpu_pmu_caps_node *n)
{
- const char *delimiter = "# cpu pmu capabilities: ";
- u32 nr_caps = ff->ph->env.nr_cpu_pmu_caps;
- char *str;
+ const char *delimiter;
+ u32 nr_caps = n->nr_cpu_pmu_caps;
+ char *str, buf[128];

if (!nr_caps) {
- fprintf(fp, "# cpu pmu capabilities: not available\n");
+ if (!n->pmu_name)
+ fprintf(fp, "# cpu pmu capabilities: not available\n");
+ else
+ fprintf(fp, "# %s pmu capabilities: not available\n", n->pmu_name);
return;
}

- str = ff->ph->env.cpu_pmu_caps;
+ if (!n->pmu_name)
+ scnprintf(buf, sizeof(buf), "# cpu pmu capabilities: ");
+ else
+ scnprintf(buf, sizeof(buf), "# %s pmu capabilities: ", n->pmu_name);
+
+ delimiter = buf;
+
+ str = n->cpu_pmu_caps;
while (nr_caps--) {
fprintf(fp, "%s%s", delimiter, str);
delimiter = ", ";
@@ -1984,6 +2039,17 @@ static void print_cpu_pmu_caps(struct feat_fd *ff, FILE *fp)
fprintf(fp, "\n");
}

+static void print_cpu_pmu_caps(struct feat_fd *ff, FILE *fp)
+{
+ struct cpu_pmu_caps_node *n;
+ int i;
+
+ for (i = 0; i < ff->ph->env.nr_cpu_pmu_caps_nodes; i++) {
+ n = &ff->ph->env.cpu_pmu_caps_nodes[i];
+ print_per_cpu_pmu_caps(fp, n);
+ }
+}
+
static void print_pmu_mappings(struct feat_fd *ff, FILE *fp)
{
const char *delimiter = "# pmu mappings: ";
@@ -3093,13 +3159,14 @@ static int process_compressed(struct feat_fd *ff,
return 0;
}

-static int process_cpu_pmu_caps(struct feat_fd *ff,
- void *data __maybe_unused)
+static int process_cpu_pmu_caps_node(struct feat_fd *ff,
+ struct cpu_pmu_caps_node *n, bool *end)
{
- char *name, *value;
+ char *name, *value, *pmu_name;
struct strbuf sb;
- u32 nr_caps;
+ u32 nr_caps, nr;

+ *end = false;
if (do_read_u32(ff, &nr_caps))
return -1;

@@ -3108,7 +3175,7 @@ static int process_cpu_pmu_caps(struct feat_fd *ff,
return 0;
}

- ff->ph->env.nr_cpu_pmu_caps = nr_caps;
+ n->nr_cpu_pmu_caps = nr_caps;

if (strbuf_init(&sb, 128) < 0)
return -1;
@@ -3129,13 +3196,33 @@ static int process_cpu_pmu_caps(struct feat_fd *ff,
if (strbuf_add(&sb, "", 1) < 0)
goto free_value;

- if (!strcmp(name, "branches"))
- ff->ph->env.max_branches = atoi(value);
+ if (!strcmp(name, "branches")) {
+ n->max_branches = atoi(value);
+ if (n->max_branches > ff->ph->env.max_branches)
+ ff->ph->env.max_branches = n->max_branches;
+ }

free(value);
free(name);
}
- ff->ph->env.cpu_pmu_caps = strbuf_detach(&sb, NULL);
+
+ /*
+ * Old perf.data may not have pmu_name,
+ */
+ pmu_name = do_read_string(ff);
+ if (!pmu_name || strncmp(pmu_name, "cpu_", 4)) {
+ *end = true;
+ goto out;
+ }
+
+ if (do_read_u32(ff, &nr))
+ return -1;
+
+ if (nr == 0)
+ *end = true;
+out:
+ n->cpu_pmu_caps = strbuf_detach(&sb, NULL);
+ n->pmu_name = pmu_name;
return 0;

free_value:
@@ -3147,6 +3234,50 @@ static int process_cpu_pmu_caps(struct feat_fd *ff,
return -1;
}

+static int process_cpu_pmu_caps(struct feat_fd *ff,
+ void *data __maybe_unused)
+{
+ struct cpu_pmu_caps_node *nodes = NULL, *tmp;
+ int ret, i, nr_alloc = 1, nr_used = 0;
+ bool end;
+
+ while (1) {
+ if (nr_used == nr_alloc || !nodes) {
+ nr_alloc *= 2;
+ tmp = realloc(nodes, sizeof(*nodes) * nr_alloc);
+ if (!tmp)
+ return -ENOMEM;
+ memset(tmp + nr_used, 0,
+ sizeof(*nodes) * (nr_alloc - nr_used));
+ nodes = tmp;
+ }
+
+ ret = process_cpu_pmu_caps_node(ff, &nodes[nr_used], &end);
+ if (ret) {
+ if (nr_used)
+ break;
+ goto err;
+ }
+
+ nr_used++;
+ if (end)
+ break;
+ }
+
+ ff->ph->env.nr_cpu_pmu_caps_nodes = (u32)nr_used;
+ ff->ph->env.cpu_pmu_caps_nodes = nodes;
+ return 0;
+
+err:
+ for (i = 0; i < nr_used; i++) {
+ free(nodes[i].cpu_pmu_caps);
+ free(nodes[i].pmu_name);
+ }
+
+ free(nodes);
+ return ret;
+}
+
#define FEAT_OPR(n, func, __full_only) \
[HEADER_##n] = { \
.name = __stringify(n), \
--
2.7.4

2021-02-08 18:09:21

by Liang, Kan

Subject: [PATCH 14/49] perf/x86: Remove temporary pmu assignment in event_init

From: Kan Liang <[email protected]>

The temporary pmu assignment in event_init is unnecessary.

The assignment was introduced by commit 8113070d6639 ("perf_events:
Add fast-path to the rescheduling code"). At that time, event->pmu was
not yet assigned when initializing an event, so the assignment was
required. However, since commit 7e5b2a01d2ca ("perf: provide PMU when
initing events"), event->pmu is set before event_init is invoked, so
the temporary pmu assignment in event_init can be removed.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 11 -----------
1 file changed, 11 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 29dee3f..bdcd3ad 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2294,7 +2294,6 @@ static int validate_group(struct perf_event *event)

static int x86_pmu_event_init(struct perf_event *event)
{
- struct pmu *tmp;
int err;

switch (event->attr.type) {
@@ -2309,20 +2308,10 @@ static int x86_pmu_event_init(struct perf_event *event)

err = __x86_pmu_event_init(event);
if (!err) {
- /*
- * we temporarily connect event to its pmu
- * such that validate_group() can classify
- * it as an x86 event using is_x86_event()
- */
- tmp = event->pmu;
- event->pmu = &pmu;
-
if (event->group_leader != event)
err = validate_group(event);
else
err = validate_event(event);
-
- event->pmu = tmp;
}
if (err) {
if (event->destroy)
--
2.7.4

2021-02-08 18:09:21

by Liang, Kan

Subject: [PATCH 39/49] perf parse-events: Support hybrid raw events

From: Jin Yao <[email protected]>

On a hybrid platform, the same raw event may be available on both the
cpu_core pmu and the cpu_atom pmu, so a single raw event encoding can
create two raw events.
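
The two opens can be reproduced outside the perf tool. A minimal,
self-contained sketch (not perf code; the pmu type values 4 and 10 are
taken from the -vv output below and will differ on other systems):

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>

static int open_raw(__u32 pmu_type, __u64 config, int cpu)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = pmu_type;		/* cpu_core or cpu_atom pmu type */
	attr.config = config;		/* raw encoding, e.g. 0x3c */
	attr.disabled = 1;

	return syscall(__NR_perf_event_open, &attr, -1, cpu, -1, 0);
}

int main(void)
{
	/* on this box cpu 0 is a cpu_core cpu, cpu 16 a cpu_atom cpu */
	int fd_core = open_raw(4, 0x3c, 0);
	int fd_atom = open_raw(10, 0x3c, 16);

	printf("core fd %d, atom fd %d\n", fd_core, fd_atom);
	return 0;
}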

root@otcpl-adl-s-2:~# ./perf stat -e r3c -a -vv -- sleep 1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 4
size 120
config 0x3c
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
------------------------------------------------------------
perf_event_attr:
type 10
size 120
config 0x3c
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 21
sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 22
sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 23
sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 24
sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 25
sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 26
sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
...

Performance counter stats for 'system wide':

13,107,070 r3c
316,562 r3c

1.002161379 seconds time elapsed

Raw events inside a pmu term are also supported. The syntax is similar:

cpu_core/<raw event>/
cpu_atom/<raw event>/

root@otcpl-adl-s-2:~# ./perf stat -e cpu_core/r3c/ -vv -- ./triad_loop
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 4
size 120
config 0x3c
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 23641 cpu -1 group_fd -1 flags 0x8 = 3
cpu_core/r3c/: 0: 401407363 102724005 102724005
cpu_core/r3c/: 401407363 102724005 102724005

Performance counter stats for './triad_loop':

401,407,363 cpu_core/r3c/

0.103186241 seconds time elapsed

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/parse-events.c | 56 +++++++++++++++++++++++++++++++++++++++++-
1 file changed, 55 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index ddf6f79..6d7a2ce 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1532,6 +1532,55 @@ static int add_hybrid_numeric(struct parse_events_state *parse_state,
return 0;
}

+static int create_hybrid_raw_event(struct parse_events_state *parse_state,
+ struct list_head *list,
+ struct perf_event_attr *attr,
+ struct list_head *head_config,
+ struct list_head *config_terms,
+ struct perf_pmu *pmu)
+{
+ struct evsel *evsel;
+
+ attr->type = pmu->type;
+ evsel = __add_event(list, &parse_state->idx, attr, true,
+ get_config_name(head_config),
+ pmu, config_terms, false, NULL);
+ if (evsel)
+ evsel->pmu_name = strdup(pmu->name);
+ else
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int add_hybrid_raw(struct parse_events_state *parse_state,
+ struct list_head *list,
+ struct perf_event_attr *attr,
+ struct list_head *head_config,
+ struct list_head *config_terms,
+ bool *hybrid)
+{
+ struct perf_pmu *pmu;
+ int ret;
+
+ *hybrid = false;
+ perf_pmu__for_each_hybrid_pmus(pmu) {
+ *hybrid = true;
+ if (parse_state->pmu_name &&
+ strcmp(parse_state->pmu_name, pmu->name)) {
+ continue;
+ }
+
+ ret = create_hybrid_raw_event(parse_state, list, attr,
+ head_config, config_terms,
+ pmu);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
int parse_events_add_numeric(struct parse_events_state *parse_state,
struct list_head *list,
u32 type, u64 config,
@@ -1558,7 +1607,12 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
/*
* Skip the software dummy event.
*/
- if (type != PERF_TYPE_SOFTWARE) {
+ if (type == PERF_TYPE_RAW) {
+ ret = add_hybrid_raw(parse_state, list, &attr, head_config,
+ &config_terms, &hybrid);
+ if (hybrid)
+ return ret;
+ } else if (type != PERF_TYPE_SOFTWARE) {
if (!perf_pmu__hybrid_exist())
perf_pmu__scan(NULL);

--
2.7.4

2021-02-08 18:10:23

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 37/49] perf parse-events: Support hardware events inside PMU

From: Jin Yao <[email protected]>

On a hybrid platform, some hardware cache events are only available
on a specific pmu. For example, 'L1-dcache-load-misses' is only
available on the 'cpu_core' pmu. Even for an event that is available
on both pmus, the user may want to enable it on only one of them.
So the following syntax is now supported:

cpu_core/<hardware event>/
cpu_core/<hardware cache event>/
cpu_core/<pmu event>/

cpu_atom/<hardware event>/
cpu_atom/<hardware cache event>/
cpu_atom/<pmu event>/

This limits the event to being enabled only on the specified pmu.

The patch uses this idea. For example, if we use "cpu_core/LLC-loads/",
then in parse_events_add_pmu() term->config is "LLC-loads".

We create a new "parse_events_state" with the pmu_name and use
parse_events__scanner to scan term->config (the string "LLC-loads"
in this example). parse_events_add_cache() is called during parsing,
and parse_state->pmu_name identifies the pmu on which the event
should be enabled.

For example,

root@otcpl-adl-s-2:~# ./perf stat -e cpu_core/cycles/,cpu_core/LLC-loads/ -vv -- ./triad_loop
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0x400000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 29207 cpu -1 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
type 7
size 120
config 0x400000002
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 29207 cpu -1 group_fd -1 flags 0x8 = 4
cycles: 0: 401363820 101974864 101974864
LLC-loads: 0: 2577 101974864 101974864
cycles: 401363820 101974864 101974864
LLC-loads: 2577 101974864 101974864

Performance counter stats for './triad_loop':

401,363,820 cycles
2,577 LLC-loads

0.102416870 seconds time elapsed

root@otcpl-adl-s-2:~# ./perf stat -e cpu_atom/cycles/,cpu_atom/LLC-loads/ -vv -- taskset -c 16 ./triad_loop
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 29212 cpu -1 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
type 7
size 120
config 0xa00000002
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 29212 cpu -1 group_fd -1 flags 0x8 = 4
cycles: 0: 602052607 201353578 200990459
LLC-loads: 0: 4428 201353578 200990459
cycles: 603140304 201353578 200990459
LLC-loads: 4435 201353578 200990459

Performance counter stats for 'taskset -c 16 ./triad_loop':

603,140,304 cycles (99.82%)
4,435 LLC-loads (99.82%)

0.203948454 seconds time elapsed

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/parse-events.c | 100 ++++++++++++++++++++++++++++++++++++++---
tools/perf/util/parse-events.h | 6 ++-
tools/perf/util/parse-events.y | 21 +++------
3 files changed, 105 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 28d356e..bba7db3 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -489,7 +489,8 @@ static int create_hybrid_cache_event(struct list_head *list, int *idx,
static int add_hybrid_cache(struct list_head *list, int *idx,
struct perf_event_attr *attr, char *name,
struct list_head *config_terms,
- bool *hybrid)
+ bool *hybrid,
+ struct parse_events_state *parse_state)
{
struct perf_pmu *pmu;
int ret;
@@ -497,6 +498,11 @@ static int add_hybrid_cache(struct list_head *list, int *idx,
*hybrid = false;
perf_pmu__for_each_hybrid_pmus(pmu) {
*hybrid = true;
+ if (parse_state->pmu_name &&
+ strcmp(parse_state->pmu_name, pmu->name)) {
+ continue;
+ }
+
ret = create_hybrid_cache_event(list, idx, attr, name,
config_terms, pmu);
if (ret)
@@ -509,7 +515,8 @@ static int add_hybrid_cache(struct list_head *list, int *idx,
int parse_events_add_cache(struct list_head *list, int *idx,
char *type, char *op_result1, char *op_result2,
struct parse_events_error *err,
- struct list_head *head_config)
+ struct list_head *head_config,
+ struct parse_events_state *parse_state)
{
struct perf_event_attr attr;
LIST_HEAD(config_terms);
@@ -582,7 +589,7 @@ int parse_events_add_cache(struct list_head *list, int *idx,
perf_pmu__scan(NULL);

ret = add_hybrid_cache(list, idx, &attr, config_name ? : name,
- &config_terms, &hybrid);
+ &config_terms, &hybrid, parse_state);
if (hybrid)
return ret;

@@ -1512,6 +1519,11 @@ static int add_hybrid_numeric(struct parse_events_state *parse_state,
*hybrid = false;
perf_pmu__for_each_hybrid_pmus(pmu) {
*hybrid = true;
+ if (parse_state->pmu_name &&
+ strcmp(parse_state->pmu_name, pmu->name)) {
+ continue;
+ }
+
ret = create_hybrid_hw_event(parse_state, list, attr, pmu);
if (ret)
return ret;
@@ -1578,6 +1590,10 @@ static bool config_term_percore(struct list_head *config_terms)
return false;
}

+static int parse_events_with_hybrid_pmu(struct parse_events_state *parse_state,
+ const char *str, char *name, bool *found,
+ struct list_head *list);
+
int parse_events_add_pmu(struct parse_events_state *parse_state,
struct list_head *list, char *name,
struct list_head *head_config,
@@ -1589,7 +1605,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state,
struct perf_pmu *pmu;
struct evsel *evsel;
struct parse_events_error *err = parse_state->error;
- bool use_uncore_alias;
+ bool use_uncore_alias, found;
LIST_HEAD(config_terms);

if (verbose > 1) {
@@ -1605,6 +1621,22 @@ int parse_events_add_pmu(struct parse_events_state *parse_state,
fprintf(stderr, "' that may result in non-fatal errors\n");
}

+ if (head_config && perf_pmu__is_hybrid(name)) {
+ struct parse_events_term *term;
+ int ret;
+
+ list_for_each_entry(term, head_config, list) {
+ if (!term->config)
+ continue;
+ ret = parse_events_with_hybrid_pmu(parse_state,
+ term->config,
+ name, &found,
+ list);
+ if (found)
+ return ret;
+ }
+ }
+
pmu = parse_state->fake_pmu ?: perf_pmu__find(name);
if (!pmu) {
char *err_str;
@@ -1713,12 +1745,19 @@ int parse_events_multi_pmu_add(struct parse_events_state *parse_state,
struct perf_pmu *pmu = NULL;
int ok = 0;

+ if (parse_state->pmu_name) {
+ list = alloc_list();
+ if (!list)
+ return -1;
+ *listp = list;
+ return 0;
+ }
+
*listp = NULL;
/* Add it for all PMUs that support the alias */
- list = malloc(sizeof(struct list_head));
+ list = alloc_list();
if (!list)
return -1;
- INIT_LIST_HEAD(list);
while ((pmu = perf_pmu__scan(pmu)) != NULL) {
struct perf_pmu_alias *alias;

@@ -2284,6 +2323,44 @@ int parse_events_terms(struct list_head *terms, const char *str)
return ret;
}

+static int list_num(struct list_head *list)
+{
+ struct list_head *pos;
+ int n = 0;
+
+ list_for_each(pos, list)
+ n++;
+
+ return n;
+}
+
+static int parse_events_with_hybrid_pmu(struct parse_events_state *parse_state,
+ const char *str, char *pmu_name,
+ bool *found, struct list_head *list)
+{
+ struct parse_events_state ps = {
+ .list = LIST_HEAD_INIT(ps.list),
+ .stoken = PE_START_EVENTS,
+ .pmu_name = pmu_name,
+ .idx = parse_state->idx,
+ };
+ int ret;
+
+ *found = false;
+ ret = parse_events__scanner(str, &ps);
+ perf_pmu__parse_cleanup();
+
+ if (!ret) {
+ if (!list_empty(&ps.list)) {
+ *found = true;
+ list_splice(&ps.list, list);
+ parse_state->idx = list_num(list);
+ }
+ }
+
+ return ret;
+}
+
int __parse_events(struct evlist *evlist, const char *str,
struct parse_events_error *err, struct perf_pmu *fake_pmu)
{
@@ -3309,3 +3386,14 @@ char *parse_events_formats_error_string(char *additional_terms)
fail:
return NULL;
}
+
+struct list_head *alloc_list(void)
+{
+ struct list_head *list = malloc(sizeof(*list));
+
+ if (!list)
+ return NULL;
+
+ INIT_LIST_HEAD(list);
+ return list;
+}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index b875485..6c91abc 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -138,6 +138,7 @@ struct parse_events_state {
struct list_head *terms;
int stoken;
struct perf_pmu *fake_pmu;
+ char *pmu_name;
};

void parse_events__handle_error(struct parse_events_error *err, int idx,
@@ -188,7 +189,8 @@ int parse_events_add_tool(struct parse_events_state *parse_state,
int parse_events_add_cache(struct list_head *list, int *idx,
char *type, char *op_result1, char *op_result2,
struct parse_events_error *error,
- struct list_head *head_config);
+ struct list_head *head_config,
+ struct parse_events_state *parse_state);
int parse_events_add_breakpoint(struct list_head *list, int *idx,
u64 addr, char *type, u64 len);
int parse_events_add_pmu(struct parse_events_state *parse_state,
@@ -243,6 +245,8 @@ char *parse_events_formats_error_string(char *additional_terms);
void parse_events_print_error(struct parse_events_error *err,
const char *event);

+struct list_head *alloc_list(void);
+
#ifdef HAVE_LIBELF_SUPPORT
/*
* If the probe point starts with '%',
diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
index d5b6aff..137c7fa 100644
--- a/tools/perf/util/parse-events.y
+++ b/tools/perf/util/parse-events.y
@@ -26,18 +26,6 @@ do { \
YYABORT; \
} while (0)

-static struct list_head* alloc_list(void)
-{
- struct list_head *list;
-
- list = malloc(sizeof(*list));
- if (!list)
- return NULL;
-
- INIT_LIST_HEAD(list);
- return list;
-}
-
static void free_list_evsel(struct list_head* list_evsel)
{
struct evsel *evsel, *tmp;
@@ -450,7 +438,8 @@ PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT opt_e

list = alloc_list();
ABORT_ON(!list);
- err = parse_events_add_cache(list, &parse_state->idx, $1, $3, $5, error, $6);
+ err = parse_events_add_cache(list, &parse_state->idx, $1, $3, $5, error, $6,
+ parse_state);
parse_events_terms__delete($6);
free($1);
free($3);
@@ -471,7 +460,8 @@ PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT opt_event_config

list = alloc_list();
ABORT_ON(!list);
- err = parse_events_add_cache(list, &parse_state->idx, $1, $3, NULL, error, $4);
+ err = parse_events_add_cache(list, &parse_state->idx, $1, $3, NULL, error, $4,
+ parse_state);
parse_events_terms__delete($4);
free($1);
free($3);
@@ -491,7 +481,8 @@ PE_NAME_CACHE_TYPE opt_event_config

list = alloc_list();
ABORT_ON(!list);
- err = parse_events_add_cache(list, &parse_state->idx, $1, NULL, NULL, error, $2);
+ err = parse_events_add_cache(list, &parse_state->idx, $1, NULL, NULL, error, $2,
+ parse_state);
parse_events_terms__delete($2);
free($1);
if (err) {
--
2.7.4

2021-02-08 18:11:40

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 15/49] perf/x86: Factor out x86_pmu_show_pmu_cap

From: Kan Liang <[email protected]>

The PMU capabilities are different among hybrid PMUs. Perf should dump
the PMU capabilities information for each hybrid PMU.

Factor out x86_pmu_show_pmu_cap() which shows the PMU capabilities
information. The function will be reused later when registering a
dedicated hybrid PMU.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 25 ++++++++++++++++---------
arch/x86/events/perf_event.h | 3 +++
2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index bdcd3ad..bbd87b7 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1979,6 +1979,20 @@ perf_guest_get_msrs_nop(int *nr)
return NULL;
}

+void x86_pmu_show_pmu_cap(int num_counters, int num_counters_fixed,
+ u64 intel_ctrl)
+{
+ pr_info("... version: %d\n", x86_pmu.version);
+ pr_info("... bit width: %d\n", x86_pmu.cntval_bits);
+ pr_info("... generic registers: %d\n", num_counters);
+ pr_info("... value mask: %016Lx\n", x86_pmu.cntval_mask);
+ pr_info("... max period: %016Lx\n", x86_pmu.max_period);
+ pr_info("... fixed-purpose events: %lu\n",
+ hweight64((((1ULL << num_counters_fixed) - 1)
+ << INTEL_PMC_IDX_FIXED) & intel_ctrl));
+ pr_info("... event mask: %016Lx\n", intel_ctrl);
+}
+
static int __init init_hw_perf_events(void)
{
struct x86_pmu_quirk *quirk;
@@ -2039,15 +2053,8 @@ static int __init init_hw_perf_events(void)

pmu.attr_update = x86_pmu.attr_update;

- pr_info("... version: %d\n", x86_pmu.version);
- pr_info("... bit width: %d\n", x86_pmu.cntval_bits);
- pr_info("... generic registers: %d\n", x86_pmu.num_counters);
- pr_info("... value mask: %016Lx\n", x86_pmu.cntval_mask);
- pr_info("... max period: %016Lx\n", x86_pmu.max_period);
- pr_info("... fixed-purpose events: %lu\n",
- hweight64((((1ULL << x86_pmu.num_counters_fixed) - 1)
- << INTEL_PMC_IDX_FIXED) & x86_pmu.intel_ctrl));
- pr_info("... event mask: %016Lx\n", x86_pmu.intel_ctrl);
+ x86_pmu_show_pmu_cap(x86_pmu.num_counters, x86_pmu.num_counters_fixed,
+ x86_pmu.intel_ctrl);

if (!x86_pmu.read)
x86_pmu.read = _x86_pmu_read;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 560410c..d5fcc15 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1089,6 +1089,9 @@ void x86_pmu_enable_event(struct perf_event *event);

int x86_pmu_handle_irq(struct pt_regs *regs);

+void x86_pmu_show_pmu_cap(int num_counters, int num_counters_fixed,
+ u64 intel_ctrl);
+
extern struct event_constraint emptyconstraint;

extern struct event_constraint unconstrained;
--
2.7.4

2021-02-08 18:12:31

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 16/49] perf/x86: Register hybrid PMUs

From: Kan Liang <[email protected]>

Different hybrid PMUs have different PMU capabilities and events. Perf
should register a dedicated PMU for each of them.

To check whether an event is an X86 event, perf has to go through all possible hybrid PMUs.

Only the PMU for the boot CPU is registered in init_hw_perf_events(),
because the boot CPU is the only online CPU at that moment.
The init_hybrid_pmu() is introduced to register and initialize the PMUs
of the other CPU types when a CPU of a new type comes online.

All hybrid PMUs have the capability PERF_PMU_CAP_HETEROGENEOUS_CPUS.
The PMU name for hybrid PMUs will be "cpu_XXX", which will be assigned
later in a separate patch.

The PMU type id for the core PMU is still PERF_TYPE_RAW. For the other
hybrid PMUs, the PMU type id is not hard-coded.

The event->cpu must be compatible with the supported CPUs of the PMU.
Add a check in x86_pmu_event_init().

The events in a group must be from the same type of hybrid PMU.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 98 +++++++++++++++++++++++++++++++++-------
arch/x86/events/intel/core.c | 105 ++++++++++++++++++++++++++++++++++++++++++-
arch/x86/events/perf_event.h | 24 ++++++++++
3 files changed, 211 insertions(+), 16 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index bbd87b7..44ad8dc 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -480,7 +480,7 @@ int x86_setup_perfctr(struct perf_event *event)
local64_set(&hwc->period_left, hwc->sample_period);
}

- if (attr->type == PERF_TYPE_RAW)
+ if (attr->type == event->pmu->type)
return x86_pmu_extra_regs(event->attr.config, event);

if (attr->type == PERF_TYPE_HW_CACHE)
@@ -615,7 +615,7 @@ int x86_pmu_hw_config(struct perf_event *event)
if (!event->attr.exclude_kernel)
event->hw.config |= ARCH_PERFMON_EVENTSEL_OS;

- if (event->attr.type == PERF_TYPE_RAW)
+ if (event->attr.type == event->pmu->type)
event->hw.config |= event->attr.config & X86_RAW_EVENT_MASK;

if (event->attr.sample_period && x86_pmu.limit_period) {
@@ -746,7 +746,16 @@ static struct pmu pmu;

static inline int is_x86_event(struct perf_event *event)
{
- return event->pmu == &pmu;
+ int bit;
+
+ if (!IS_X86_HYBRID)
+ return event->pmu == &pmu;
+
+ for_each_set_bit(bit, &x86_pmu.hybrid_pmu_bitmap, X86_HYBRID_PMU_MAX_INDEX) {
+ if (event->pmu == &x86_pmu.hybrid_pmu[bit].pmu)
+ return true;
+ }
+ return false;
}

struct pmu *x86_get_pmu(void)
@@ -2053,8 +2062,11 @@ static int __init init_hw_perf_events(void)

pmu.attr_update = x86_pmu.attr_update;

- x86_pmu_show_pmu_cap(x86_pmu.num_counters, x86_pmu.num_counters_fixed,
- x86_pmu.intel_ctrl);
+ if (!IS_X86_HYBRID) {
+ x86_pmu_show_pmu_cap(x86_pmu.num_counters,
+ x86_pmu.num_counters_fixed,
+ x86_pmu.intel_ctrl);
+ }

if (!x86_pmu.read)
x86_pmu.read = _x86_pmu_read;
@@ -2084,9 +2096,36 @@ static int __init init_hw_perf_events(void)
if (err)
goto out1;

- err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
- if (err)
- goto out2;
+ if (!IS_X86_HYBRID) {
+ err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
+ if (err)
+ goto out2;
+ } else {
+ struct x86_hybrid_pmu *hybrid_pmu;
+ int bit;
+
+ for_each_set_bit(bit, &x86_pmu.hybrid_pmu_bitmap, X86_HYBRID_PMU_MAX_INDEX) {
+ hybrid_pmu = &x86_pmu.hybrid_pmu[bit];
+
+ hybrid_pmu->pmu = pmu;
+ hybrid_pmu->pmu.type = -1;
+ hybrid_pmu->pmu.attr_update = x86_pmu.attr_update;
+ hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;
+
+ /* Only register the PMU for the boot CPU */
+ if (bit != x86_hybrid_get_idx_from_cpu(smp_processor_id()))
+ continue;
+
+ if (X86_HYBRID_PMU_CORE_IDX == bit)
+ err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name, PERF_TYPE_RAW);
+ else
+ err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name, -1);
+ if (err)
+ clear_bit(bit, &x86_pmu.hybrid_pmu_bitmap);
+ }
+ if (!x86_pmu.hybrid_pmu_bitmap)
+ goto out2;
+ }

return 0;

@@ -2221,6 +2260,11 @@ static struct cpu_hw_events *allocate_fake_cpuc(void)
return ERR_PTR(-ENOMEM);
cpuc->is_fake = 1;

+ if (IS_X86_HYBRID)
+ cpuc->hybrid_pmu_idx = x86_hybrid_get_idx_from_cpu(cpu);
+ else
+ cpuc->hybrid_pmu_idx = X86_NON_HYBRID_PMU;
+
if (intel_cpuc_prepare(cpuc, cpu))
goto error;

@@ -2273,6 +2317,28 @@ static int validate_group(struct perf_event *event)
struct cpu_hw_events *fake_cpuc;
int ret = -EINVAL, n;

+ /*
+ * Reject events from different hybrid PMUs.
+ */
+ if (IS_X86_HYBRID) {
+ struct perf_event *sibling;
+ struct pmu *pmu = NULL;
+
+ if (leader->pmu->task_ctx_nr == perf_hw_context)
+ pmu = leader->pmu;
+ else {
+ for_each_sibling_event(sibling, leader) {
+ if (sibling->pmu->task_ctx_nr == perf_hw_context) {
+ pmu = sibling->pmu;
+ break;
+ }
+ }
+ }
+
+ if (pmu && pmu != event->pmu)
+ return ret;
+ }
+
fake_cpuc = allocate_fake_cpuc();
if (IS_ERR(fake_cpuc))
return PTR_ERR(fake_cpuc);
@@ -2301,16 +2367,18 @@ static int validate_group(struct perf_event *event)

static int x86_pmu_event_init(struct perf_event *event)
{
+ struct x86_hybrid_pmu *hybrid_pmu = NULL;
int err;

- switch (event->attr.type) {
- case PERF_TYPE_RAW:
- case PERF_TYPE_HARDWARE:
- case PERF_TYPE_HW_CACHE:
- break;
-
- default:
+ if ((event->attr.type != event->pmu->type) &&
+ (event->attr.type != PERF_TYPE_HARDWARE) &&
+ (event->attr.type != PERF_TYPE_HW_CACHE))
return -ENOENT;
+
+ if (IS_X86_HYBRID && (event->cpu != -1)) {
+ hybrid_pmu = container_of(event->pmu, struct x86_hybrid_pmu, pmu);
+ if (!cpumask_test_cpu(event->cpu, &hybrid_pmu->supported_cpus))
+ return -ENOENT;
}

err = __x86_pmu_event_init(event);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 559b4e9..d2de342 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3721,7 +3721,8 @@ static int intel_pmu_hw_config(struct perf_event *event)
event->hw.flags |= PERF_X86_EVENT_PEBS_VIA_PT;
}

- if (event->attr.type != PERF_TYPE_RAW)
+ if ((event->attr.type == PERF_TYPE_HARDWARE) ||
+ (event->attr.type == PERF_TYPE_HW_CACHE))
return 0;

/*
@@ -4304,12 +4305,97 @@ static void intel_pmu_check_extra_regs(struct extra_reg *extra_regs)
}
}

+static void init_hybrid_pmu(int cpu)
+{
+ struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
+ int idx = x86_hybrid_get_idx_from_cpu(cpu);
+ struct x86_hybrid_pmu *pmu;
+ struct perf_cpu_context *cpuctx;
+ unsigned int fixed_mask, unused_eax, unused_ebx, unused_edx;
+
+ if (WARN_ON(!IS_VALID_HYBRID_PMU_IDX(idx)))
+ return;
+
+ if (!test_bit(idx, &x86_pmu.hybrid_pmu_bitmap))
+ return;
+
+ cpuc->hybrid_pmu_idx = idx;
+ pmu = &x86_pmu.hybrid_pmu[idx];
+
+ /* Only register PMU for the first CPU */
+ if (!cpumask_empty(&pmu->supported_cpus)) {
+ cpumask_set_cpu(cpu, &pmu->supported_cpus);
+
+ /*
+ * The cpuctx of all CPUs are allocated when registering the
+ * boot CPU's PMU. At that time, the PMU for other hybrid CPUs
+ * is not registered yet. The boot CPU's PMU was
+ * unconditionally assigned to each cpuctx->ctx.pmu.
+ * Update the cpuctx->ctx.pmu when the PMU for other hybrid
+ * CPUs is known.
+ */
+ cpuctx = per_cpu_ptr(pmu->pmu.pmu_cpu_context, cpu);
+ cpuctx->ctx.pmu = &pmu->pmu;
+ return;
+ }
+
+ if ((pmu->pmu.type == -1) &&
+ perf_pmu_register(&pmu->pmu, pmu->name,
+ (idx == X86_HYBRID_PMU_CORE_IDX) ? PERF_TYPE_RAW : -1))
+ return;
+
+ cpuctx = per_cpu_ptr(pmu->pmu.pmu_cpu_context, cpu);
+ cpuctx->ctx.pmu = &pmu->pmu;
+
+ /*
+ * Except for ECX, other fields have been stored in the x86 struct
+ * at boot time.
+ */
+ cpuid(10, &unused_eax, &unused_ebx, &fixed_mask, &unused_edx);
+
+ intel_pmu_check_num_counters(&pmu->num_counters,
+ &pmu->num_counters_fixed,
+ &pmu->intel_ctrl,
+ (u64)fixed_mask);
+
+ pr_info("%s PMU driver: ", pmu->name);
+
+ if (pmu->intel_cap.perf_metrics) {
+ pmu->intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;
+ pmu->intel_ctrl |= INTEL_PMC_MSK_FIXED_SLOTS;
+ }
+
+ if (pmu->intel_cap.pebs_output_pt_available) {
+ pmu->pmu.capabilities |= PERF_PMU_CAP_AUX_OUTPUT;
+ pr_cont("PEBS-via-PT ");
+ }
+
+ intel_pmu_check_event_constraints(pmu->event_constraints,
+ pmu->num_counters,
+ pmu->num_counters_fixed,
+ pmu->intel_ctrl);
+
+ intel_pmu_check_extra_regs(pmu->extra_regs);
+
+ pr_cont("\n");
+
+ x86_pmu_show_pmu_cap(pmu->num_counters, pmu->num_counters_fixed,
+ pmu->intel_ctrl);
+
+ cpumask_set_cpu(cpu, &pmu->supported_cpus);
+
+ return;
+}
+
static void intel_pmu_cpu_starting(int cpu)
{
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
int core_id = topology_core_id(cpu);
int i;

+ if (IS_X86_HYBRID)
+ init_hybrid_pmu(cpu);
+
init_debug_store_on_cpu(cpu);
/*
* Deal with CPUs that don't clear their LBRs on power-up.
@@ -4424,6 +4510,23 @@ void intel_cpuc_finish(struct cpu_hw_events *cpuc)
static void intel_pmu_cpu_dead(int cpu)
{
intel_cpuc_finish(&per_cpu(cpu_hw_events, cpu));
+
+ if (IS_X86_HYBRID) {
+ struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
+ int idx = x86_hybrid_get_idx_from_cpu(cpu);
+ struct x86_hybrid_pmu *hybrid_pmu;
+
+ if (WARN_ON(!IS_VALID_HYBRID_PMU_IDX(idx)))
+ return;
+
+ hybrid_pmu = &x86_pmu.hybrid_pmu[idx];
+ cpumask_clear_cpu(cpu, &hybrid_pmu->supported_cpus);
+ cpuc->hybrid_pmu_idx = X86_NON_HYBRID_PMU;
+ if (cpumask_empty(&hybrid_pmu->supported_cpus)) {
+ perf_pmu_unregister(&hybrid_pmu->pmu);
+ hybrid_pmu->pmu.type = -1;
+ }
+ }
}

static void intel_pmu_sched_task(struct perf_event_context *ctx,
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index d5fcc15..740ba48 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -643,9 +643,14 @@ enum x86_hybrid_pmu_type_idx {
X86_HYBRID_PMU_MAX_INDEX
};

+#define X86_HYBRID_ATOM_CPU_TYPE 0x20
+#define X86_HYBRID_CORE_CPU_TYPE 0x40

struct x86_hybrid_pmu {
struct pmu pmu;
+ const char *name;
+ u32 cpu_type;
+ cpumask_t supported_cpus;
union perf_capabilities intel_cap;
u64 intel_ctrl;
int max_pebs_events;
@@ -679,6 +684,25 @@ struct x86_hybrid_pmu {
#define X86_HYBRID_READ_FROM_EVENT(_name, _event) \
(IS_X86_HYBRID ? ((struct x86_hybrid_pmu *)(_event->pmu))->_name : x86_pmu._name)

+#define IS_VALID_HYBRID_PMU_IDX(idx) \
+ (idx < X86_HYBRID_PMU_MAX_INDEX && idx > X86_NON_HYBRID_PMU)
+
+static inline enum x86_hybrid_pmu_type_idx
+x86_hybrid_get_idx_from_cpu(unsigned int cpu)
+{
+ unsigned int cpu_type = cpu_data(cpu).x86_cpu_type >> X86_HYBRID_CPU_TYPE_ID_SHIFT;
+
+ switch (cpu_type) {
+ case X86_HYBRID_ATOM_CPU_TYPE:
+ return X86_HYBRID_PMU_ATOM_IDX;
+ case X86_HYBRID_CORE_CPU_TYPE:
+ return X86_HYBRID_PMU_CORE_IDX;
+ default:
+ pr_warn("CPU %u: invalid cpu type %u\n", cpu, cpu_type);
+ }
+ return X86_NON_HYBRID_PMU;
+}
+
/*
* struct x86_pmu - generic x86 pmu
*/
--
2.7.4

2021-02-08 18:13:52

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 41/49] perf stat: Support metrics with hybrid events

From: Jin Yao <[email protected]>

One metric such as 'Kernel_Utilization' may come from different PMUs and
consist of different events.

For core,
Kernel_Utilization = cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread

For atom,
Kernel_Utilization = cpu_clk_unhalted.core:k / cpu_clk_unhalted.core

The metric group string is:
"{cpu_clk_unhalted.thread:k,cpu_clk_unhalted.thread}:W,{cpu_clk_unhalted.core:k,cpu_clk_unhalted.core}:W"

It's internally expanded to:
"{cpu_clk_unhalted.thread:k,cpu_clk_unhalted.thread}:W#cpu_core,{cpu_clk_unhalted.core:k,cpu_clk_unhalted.core}:W#cpu_atom"

That means the group "{cpu_clk_unhalted.thread:k,cpu_clk_unhalted.thread}:W"
is from the cpu_core PMU and the group "{cpu_clk_unhalted.core:k,cpu_clk_unhalted.core}"
is from the cpu_atom PMU. Next, it checks whether the events in each group
are valid on that PMU. If an event is not valid on that PMU, the associated
group is removed internally.

In this example, cpu_clk_unhalted.thread is valid on cpu_core and
cpu_clk_unhalted.core is valid on cpu_atom, so the checks for both
groups pass.
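
As an illustration (not part of the patch), a minimal standalone sketch of
the '#pmu' suffix handling described above: strip the "#pmu" suffixes from
the internally expanded metric string and remember each group's pmu, in
order. The real handling is get_metric_pmus() and
remove_pmu_umatched_events() in the diff below; buffer sizes and the output
format here are arbitrary choices for the example.

/* Sketch only: split the expanded metric string on ',' like the real code,
 * record the pmu name whenever a token carries a "#pmu" suffix, and rebuild
 * the event string that is handed to the parser. */
#include <stdio.h>
#include <string.h>

int main(void)
{
	char str[] = "{cpu_clk_unhalted.thread:k,cpu_clk_unhalted.thread}:W#cpu_core,"
		     "{cpu_clk_unhalted.core:k,cpu_clk_unhalted.core}:W#cpu_atom";
	char events[256] = "", pmus[64] = "";
	char *p, *list = str;

	while ((p = strsep(&list, ",")) != NULL) {
		char *hash = strchr(p, '#');

		if (hash) {			/* "...:W#cpu_core" closes a group */
			*hash = '\0';
			strcat(pmus, hash + 1);
			strcat(pmus, " ");
		}
		strcat(events, p);
		strcat(events, ",");
	}
	events[strlen(events) - 1] = '\0';	/* drop the trailing ',' */

	printf("events: %s\n", events);		/* string handed to the parser */
	printf("pmus  : %s\n", pmus);		/* group -> pmu mapping, in order */
	return 0;
}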

Now it reports:

root@otcpl-adl-s-2:~# ./perf stat -M Kernel_Utilization -a -- sleep 1

Performance counter stats for 'system wide':

15,302,356 cpu_clk_unhalted.thread:k # 0.96 Kernel_Utilization
16,016,529 cpu_clk_unhalted.thread
3,865,478 cpu_clk_unhalted.core:k # 0.82 Kernel_Utilization
4,699,692 cpu_clk_unhalted.core

1.002409000 seconds time elapsed

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/metricgroup.c | 220 ++++++++++++++++++++++++++++++++++++++---
tools/perf/util/stat-display.c | 5 +-
2 files changed, 213 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index df05134..36f2035 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -122,6 +122,7 @@ struct metric {
const char *metric_name;
const char *metric_expr;
const char *metric_unit;
+ const char *pmu_name;
struct list_head metric_refs;
int metric_refs_cnt;
int runtime;
@@ -185,7 +186,8 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
bool metric_no_merge,
bool has_constraint,
struct evsel **metric_events,
- unsigned long *evlist_used)
+ unsigned long *evlist_used,
+ const char *pmu_name)
{
struct evsel *ev, *current_leader = NULL;
struct expr_id_data *val_ptr;
@@ -234,8 +236,13 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
if (contains_event(metric_events, matched_events, ev->name))
continue;
/* Does this event belong to the parse context? */
- if (hashmap__find(&pctx->ids, ev->name, (void **)&val_ptr))
+ if (hashmap__find(&pctx->ids, ev->name, (void **)&val_ptr)) {
+ if (evsel__is_hybrid_event(ev) && pmu_name &&
+ strcmp(pmu_name, ev->pmu_name)) {
+ continue;
+ }
metric_events[matched_events++] = ev;
+ }

if (matched_events == events_to_match)
break;
@@ -323,7 +330,7 @@ static int metricgroup__setup_events(struct list_head *groups,
evsel = find_evsel_group(perf_evlist, &m->pctx,
metric_no_merge,
m->has_constraint, metric_events,
- evlist_used);
+ evlist_used, m->pmu_name);
if (!evsel) {
pr_debug("Cannot resolve %s: %s\n",
m->metric_name, m->metric_expr);
@@ -684,7 +691,8 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter,
}

static void metricgroup__add_metric_weak_group(struct strbuf *events,
- struct expr_parse_ctx *ctx)
+ struct expr_parse_ctx *ctx,
+ const char *pmu_name)
{
struct hashmap_entry *cur;
size_t bkt;
@@ -708,6 +716,8 @@ static void metricgroup__add_metric_weak_group(struct strbuf *events,
}
if (!no_group) {
strbuf_addf(events, "}:W");
+ if (pmu_name)
+ strbuf_addf(events, "#%s", pmu_name);
if (has_duration)
strbuf_addf(events, ",duration_time");
} else if (has_duration)
@@ -801,6 +811,7 @@ static int __add_metric(struct list_head *metric_list,
m->metric_name = pe->metric_name;
m->metric_expr = pe->metric_expr;
m->metric_unit = pe->unit;
+ m->pmu_name = pe->pmu;
m->runtime = runtime;
m->has_constraint = metric_no_group || metricgroup__has_constraint(pe);
INIT_LIST_HEAD(&m->metric_refs);
@@ -1084,7 +1095,8 @@ static int metricgroup__add_metric_sys_event_iter(struct pmu_event *pe,
static int metricgroup__add_metric(const char *metric, bool metric_no_group,
struct strbuf *events,
struct list_head *metric_list,
- struct pmu_events_map *map)
+ struct pmu_events_map *map,
+ const char *pmu_name)
{
struct expr_ids ids = { .cnt = 0, };
struct pmu_event *pe;
@@ -1097,6 +1109,9 @@ static int metricgroup__add_metric(const char *metric, bool metric_no_group,
has_match = true;
m = NULL;

+ if (pmu_name && pe->pmu && strcmp(pmu_name, pe->pmu))
+ continue;
+
ret = add_metric(&list, pe, metric_no_group, &m, NULL, &ids);
if (ret)
goto out;
@@ -1142,7 +1157,8 @@ static int metricgroup__add_metric(const char *metric, bool metric_no_group,
&m->pctx);
} else {
metricgroup__add_metric_weak_group(events,
- &m->pctx);
+ &m->pctx,
+ m->pmu_name);
}
}

@@ -1159,7 +1175,8 @@ static int metricgroup__add_metric(const char *metric, bool metric_no_group,
static int metricgroup__add_metric_list(const char *list, bool metric_no_group,
struct strbuf *events,
struct list_head *metric_list,
- struct pmu_events_map *map)
+ struct pmu_events_map *map,
+ const char *pmu_name)
{
char *llist, *nlist, *p;
int ret = -EINVAL;
@@ -1174,7 +1191,7 @@ static int metricgroup__add_metric_list(const char *list, bool metric_no_group,

while ((p = strsep(&llist, ",")) != NULL) {
ret = metricgroup__add_metric(p, metric_no_group, events,
- metric_list, map);
+ metric_list, map, pmu_name);
if (ret == -EINVAL) {
fprintf(stderr, "Cannot find metric or group `%s'\n",
p);
@@ -1211,6 +1228,172 @@ static void metricgroup__free_metrics(struct list_head *metric_list)
}
}

+static char *get_metric_pmus(char *ostr, struct strbuf *metrc_pmus,
+ bool *pmus_inited)
+{
+ char *llist, *nlist, *p1, *p2, *new_str;
+ struct strbuf new_events;
+
+ *pmus_inited = false;
+ if (!strchr(ostr, '#')) {
+ /*
+ * pmu name is added after '#'. If no '#' found,
+ * don't need to process pmu.
+ */
+ return strdup(ostr);
+ }
+
+ nlist = strdup(ostr);
+ if (!nlist)
+ return NULL;
+
+ strbuf_init(&new_events, 100);
+ strbuf_addf(&new_events, "%s", "");
+
+ strbuf_init(metrc_pmus, 100);
+ strbuf_addf(metrc_pmus, "%s", "");
+ *pmus_inited = true;
+
+ llist = nlist;
+ while ((p1 = strsep(&llist, ",")) != NULL) {
+ p2 = strchr(p1, '#');
+ if (p2) {
+ *p2 = 0;
+ strbuf_addf(&new_events, "%s,", p1);
+ strbuf_addf(metrc_pmus, "%s,", p2 + 1);
+ } else {
+ strbuf_addf(&new_events, "%s,", p1);
+ }
+ }
+
+ new_str = strdup(new_events.buf);
+ if (new_str) {
+ /* Remove last ',' */
+ new_str[strlen(new_str) - 1] = 0;
+ }
+
+ free(nlist);
+ strbuf_release(&new_events);
+ return new_str;
+}
+
+static void set_pmu_unmatched_events(struct evlist *evlist, int group_idx,
+ char *pmu_name,
+ unsigned long *evlist_removed)
+{
+ struct evsel *evsel, *pos;
+ int i = 0, j = 0;
+
+ /*
+ * Move to the first evsel of a given group
+ */
+ evlist__for_each_entry (evlist, evsel) {
+ if (evsel__is_group_leader(evsel) &&
+ evsel->core.nr_members >= 1) {
+ if (i < group_idx) {
+ j += evsel->core.nr_members;
+ i++;
+ continue;
+ } else
+ break;
+ }
+ }
+
+ i = 0;
+ evlist__for_each_entry (evlist, evsel) {
+ if (i < j) {
+ i++;
+ continue;
+ }
+
+ /*
+ * Now we are at the first evsel in the group
+ */
+ for_each_group_evsel(pos, evsel) {
+ if (evsel__is_hybrid_event(pos) &&
+ strcmp(pos->pmu_name, pmu_name)) {
+ set_bit(pos->idx, evlist_removed);
+ }
+ }
+ break;
+ }
+}
+
+static void remove_pmu_umatched_events(struct evlist *evlist, char *metric_pmus)
+{
+ struct evsel *evsel, *tmp, *new_leader = NULL;
+ unsigned long *evlist_removed;
+ char *llist, *nlist, *p1;
+ bool need_new_leader = false;
+ int i = 0, new_nr_members = 0;
+
+ nlist = strdup(metric_pmus);
+ if (!nlist)
+ return;
+
+ evlist_removed = bitmap_alloc(evlist->core.nr_entries);
+ if (!evlist_removed) {
+ free(nlist);
+ return;
+ }
+
+ llist = nlist;
+ while ((p1 = strsep(&llist, ",")) != NULL) {
+ if (strlen(p1) > 0) {
+ /*
+ * p1 points to the string of pmu name, e.g. "cpu_atom".
+ * The metric group string has pmu suffixes, e.g.
+ * "{inst_retired.any,cpu_clk_unhalted.thread}:W#cpu_core,
+ * {cpu_clk_unhalted.core,inst_retired.any_p}:W#cpu_atom"
+ * By counting the pmu name, we can know the index of
+ * group.
+ */
+ set_pmu_unmatched_events(evlist, i++, p1, evlist_removed);
+ }
+ }
+
+ evlist__for_each_entry_safe(evlist, tmp, evsel) {
+ if (test_bit(evsel->idx, evlist_removed)) {
+ if (!evsel__is_group_leader(evsel)) {
+ if (!need_new_leader) {
+ if (new_leader)
+ new_leader->leader->core.nr_members--;
+ else
+ evsel->leader->core.nr_members--;
+ } else
+ new_nr_members--;
+ } else {
+ /*
+ * If group leader is to remove, we need to
+ * prepare a new leader and adjust all group
+ * members.
+ */
+ need_new_leader = true;
+ new_nr_members = evsel->leader->core.nr_members - 1;
+ }
+
+ evlist__remove(evlist, evsel);
+ evsel__delete(evsel);
+ } else {
+ if (!evsel__is_group_leader(evsel)) {
+ if (need_new_leader) {
+ need_new_leader = false;
+ new_leader = evsel;
+ new_leader->leader = new_leader;
+ new_leader->core.nr_members = new_nr_members;
+ } else if (new_leader)
+ evsel->leader = new_leader;
+ } else {
+ need_new_leader = false;
+ new_leader = NULL;
+ }
+ }
+ }
+
+ bitmap_free(evlist_removed);
+ free(nlist);
+}
+
static int parse_groups(struct evlist *perf_evlist, const char *str,
bool metric_no_group,
bool metric_no_merge,
@@ -1219,28 +1402,43 @@ static int parse_groups(struct evlist *perf_evlist, const char *str,
struct pmu_events_map *map)
{
struct parse_events_error parse_error;
- struct strbuf extra_events;
+ struct strbuf extra_events, metric_pmus;
LIST_HEAD(metric_list);
int ret;
+ char *nlist;
+ bool pmus_inited = false;

if (metric_events->nr_entries == 0)
metricgroup__rblist_init(metric_events);
ret = metricgroup__add_metric_list(str, metric_no_group,
- &extra_events, &metric_list, map);
+ &extra_events, &metric_list, map,
+ perf_evlist->pmu_name);
if (ret)
goto out;
pr_debug("adding %s\n", extra_events.buf);
bzero(&parse_error, sizeof(parse_error));
- ret = __parse_events(perf_evlist, extra_events.buf, &parse_error, fake_pmu);
+ nlist = get_metric_pmus(extra_events.buf, &metric_pmus, &pmus_inited);
+ if (!nlist)
+ return -1;
+
+ ret = __parse_events(perf_evlist, nlist, &parse_error, fake_pmu);
if (ret) {
parse_events_print_error(&parse_error, extra_events.buf);
+ free(nlist);
goto out;
}
+
+ if (pmus_inited)
+ remove_pmu_umatched_events(perf_evlist, metric_pmus.buf);
+
+ free(nlist);
ret = metricgroup__setup_events(&metric_list, metric_no_merge,
perf_evlist, metric_events);
out:
metricgroup__free_metrics(&metric_list);
strbuf_release(&extra_events);
+ if (pmus_inited)
+ strbuf_release(&metric_pmus);
return ret;
}

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 583ae4f..961d5ac 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -565,7 +565,10 @@ static void collect_all_aliases(struct perf_stat_config *config, struct evsel *c
alias->cgrp != counter->cgrp ||
strcmp(alias->unit, counter->unit) ||
evsel__is_clock(alias) != evsel__is_clock(counter) ||
- !strcmp(alias->pmu_name, counter->pmu_name))
+ !strcmp(alias->pmu_name, counter->pmu_name) ||
+ (evsel__is_hybrid_event(alias) &&
+ evsel__is_hybrid_event(counter) &&
+ strcmp(alias->pmu_name, counter->pmu_name)))
break;
alias->merged_stat = true;
cb(config, alias, data, false);
--
2.7.4

2021-02-08 18:14:28

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 40/49] perf stat: Support --cputype option for hybrid events

From: Jin Yao <[email protected]>

The previous patch added support for the syntax that enables
an event on a specified pmu, such as:

cpu_core/<event>/
cpu_atom/<event>/

But this syntax is not very convenient when applying it to a set of
events or to a group. In the following example, we have to
explicitly assign the pmu prefix to each event.

root@otcpl-adl-s-2:~# ./perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}' -- sleep 1

Performance counter stats for 'sleep 1':

1,660,562 cycles
944,537 instructions

1.001678000 seconds time elapsed

A much easier way is:

root@otcpl-adl-s-2:~# ./perf stat --cputype core -e '{cycles,instructions}' -- sleep 1

Performance counter stats for 'sleep 1':

887,232 cycles
877,314 instructions

1.002520551 seconds time elapsed

The '--cputype' option enables the events on the specified pmu (cpu_core).

If '--cputype' conflicts with the pmu prefix of an event, '--cputype' is
ignored and a warning is displayed.

root@otcpl-adl-s-2:~# ./perf stat --cputype atom -e '{cpu_core/cycles/}' -- sleep 1
WARNING: cputype (cpu_atom) conflicts with event pmu (cpu_core), use event pmu (cpu_core)

Performance counter stats for 'sleep 1':

1,441,979 cycles

1.001177738 seconds time elapsed

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 4 ++++
tools/perf/builtin-stat.c | 24 +++++++++++++++++++
tools/perf/util/evlist.h | 1 +
tools/perf/util/parse-events.c | 43 +++++++++++++++++++++++++++++++++-
tools/perf/util/parse-events.h | 1 +
5 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 796772c..b0e357d 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -450,6 +450,10 @@ convenient for post processing.
--summary::
Print summary for interval mode (-I).

+--cputype::
+Only enable events on applying cpu with this type for hybrid platform
+(e.g. core or atom)"
+
EXAMPLES
--------

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index afb8789..44d1a5f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1117,6 +1117,26 @@ static int parse_stat_cgroups(const struct option *opt,
return parse_cgroups(opt, str, unset);
}

+static int parse_hybrid_type(const struct option *opt,
+ const char *str,
+ int unset __maybe_unused)
+{
+ struct evlist *evlist = *(struct evlist **)opt->value;
+
+ if (!list_empty(&evlist->core.entries)) {
+ fprintf(stderr, "Must define cputype before events/metrics\n");
+ return -1;
+ }
+
+ evlist->pmu_name = perf_pmu__hybrid_type_to_pmu(str);
+ if (!evlist->pmu_name) {
+ fprintf(stderr, "--cputype %s is not supported!\n", str);
+ return -1;
+ }
+
+ return 0;
+}
+
static struct option stat_options[] = {
OPT_BOOLEAN('T', "transaction", &transaction_run,
"hardware transaction statistics"),
@@ -1221,6 +1241,10 @@ static struct option stat_options[] = {
"print summary for interval mode"),
OPT_BOOLEAN(0, "quiet", &stat_config.quiet,
"don't print output (useful with record)"),
+ OPT_CALLBACK(0, "cputype", &evsel_list, "hybrid cpu type",
+ "Only enable events on applying cpu with this type "
+ "for hybrid platform (e.g. core or atom)",
+ parse_hybrid_type),
#ifdef HAVE_LIBPFM
OPT_CALLBACK(0, "pfm-events", &evsel_list, "event",
"libpfm4 event selector. use 'perf list' to list available events",
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 9741df4..c06b9ff 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -66,6 +66,7 @@ struct evlist {
struct evsel *selected;
struct events_stats stats;
struct perf_env *env;
+ const char *pmu_name;
void (*trace_event_sample_raw)(struct evlist *evlist,
union perf_event *event,
struct perf_sample *sample);
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 6d7a2ce..8cdabaa 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -498,7 +498,13 @@ static int add_hybrid_cache(struct list_head *list, int *idx,
*hybrid = false;
perf_pmu__for_each_hybrid_pmus(pmu) {
*hybrid = true;
- if (parse_state->pmu_name &&
+
+ if (parse_state->evlist && parse_state->evlist->pmu_name &&
+ strcmp(parse_state->evlist->pmu_name, pmu->name)) {
+ continue;
+ }
+
+ if (parse_state->pmu_name &&
strcmp(parse_state->pmu_name, pmu->name)) {
continue;
}
@@ -512,6 +518,19 @@ static int add_hybrid_cache(struct list_head *list, int *idx,
return 0;
}

+static void warn_pmu_conflict(struct parse_events_state *parse_state)
+{
+ if (parse_state->evlist2 && parse_state->evlist2->pmu_name &&
+ parse_state->pmu_name &&
+ strcmp(parse_state->evlist2->pmu_name, parse_state->pmu_name)) {
+ WARN_ONCE(1, "WARNING: cputype (%s) conflicts with event "
+ "pmu (%s), use event pmu (%s)\n",
+ parse_state->evlist2->pmu_name,
+ parse_state->pmu_name,
+ parse_state->pmu_name);
+ }
+}
+
int parse_events_add_cache(struct list_head *list, int *idx,
char *type, char *op_result1, char *op_result2,
struct parse_events_error *err,
@@ -588,6 +607,8 @@ int parse_events_add_cache(struct list_head *list, int *idx,
if (!perf_pmu__hybrid_exist())
perf_pmu__scan(NULL);

+ warn_pmu_conflict(parse_state);
+
ret = add_hybrid_cache(list, idx, &attr, config_name ? : name,
&config_terms, &hybrid, parse_state);
if (hybrid)
@@ -1519,6 +1540,12 @@ static int add_hybrid_numeric(struct parse_events_state *parse_state,
*hybrid = false;
perf_pmu__for_each_hybrid_pmus(pmu) {
*hybrid = true;
+
+ if (parse_state->evlist && parse_state->evlist->pmu_name &&
+ strcmp(parse_state->evlist->pmu_name, pmu->name)) {
+ continue;
+ }
+
if (parse_state->pmu_name &&
strcmp(parse_state->pmu_name, pmu->name)) {
continue;
@@ -1566,6 +1593,12 @@ static int add_hybrid_raw(struct parse_events_state *parse_state,
*hybrid = false;
perf_pmu__for_each_hybrid_pmus(pmu) {
*hybrid = true;
+
+ if (parse_state->evlist && parse_state->evlist->pmu_name &&
+ strcmp(parse_state->evlist->pmu_name, pmu->name)) {
+ continue;
+ }
+
if (parse_state->pmu_name &&
strcmp(parse_state->pmu_name, pmu->name)) {
continue;
@@ -1604,6 +1637,8 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
return -ENOMEM;
}

+ warn_pmu_conflict(parse_state);
+
/*
* Skip the software dummy event.
*/
@@ -1702,6 +1737,11 @@ int parse_events_add_pmu(struct parse_events_state *parse_state,
return -EINVAL;
}

+ if (parse_state->evlist->pmu_name && perf_pmu__is_hybrid(name) &&
+ strcmp(parse_state->evlist->pmu_name, name)) {
+ return -EINVAL;
+ }
+
if (pmu->default_config) {
memcpy(&attr, pmu->default_config,
sizeof(struct perf_event_attr));
@@ -2397,6 +2437,7 @@ static int parse_events_with_hybrid_pmu(struct parse_events_state *parse_state,
.stoken = PE_START_EVENTS,
.pmu_name = pmu_name,
.idx = parse_state->idx,
+ .evlist2 = parse_state->evlist,
};
int ret;

diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 6c91abc..c0d8a16 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -139,6 +139,7 @@ struct parse_events_state {
int stoken;
struct perf_pmu *fake_pmu;
char *pmu_name;
+ struct evlist *evlist2;
};

void parse_events__handle_error(struct parse_events_error *err, int idx,
--
2.7.4

2021-02-08 18:14:31

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 18/49] perf/x86/intel: Add attr_update for Hybrid PMUs

From: Kan Liang <[email protected]>

The attribute_group for Hybrid PMUs should be different from that of the
previous cpu PMU. For example, a cpumask attribute is required for a Hybrid
PMU, and the PMU type should be included in the event and format attributes.

Add hybrid_attr_update for the Hybrid PMUs.
Check the PMU type in the is_visible() functions. Only display the event or
format for the matching Hybrid PMU.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 120 ++++++++++++++++++++++++++++++++++++++++---
1 file changed, 114 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index d2de342..ea2541b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5254,6 +5254,106 @@ static const struct attribute_group *attr_update[] = {
NULL,
};

+static bool is_attr_for_this_pmu(struct kobject *kobj, struct attribute *attr)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct x86_hybrid_pmu *pmu =
+ container_of(dev_get_drvdata(dev), struct x86_hybrid_pmu, pmu);
+ struct perf_pmu_events_hybrid_attr *pmu_attr =
+ container_of(attr, struct perf_pmu_events_hybrid_attr, attr.attr);
+
+ return pmu->cpu_type & pmu_attr->pmu_type;
+}
+
+static umode_t hybrid_events_is_visible(struct kobject *kobj,
+ struct attribute *attr, int i)
+{
+ return is_attr_for_this_pmu(kobj, attr) ? attr->mode : 0;
+}
+
+static inline int hybrid_find_supported_cpu(struct x86_hybrid_pmu *pmu)
+{
+ int cpu = cpumask_first(&pmu->supported_cpus);
+
+ return (cpu >= nr_cpu_ids) ? -1 : cpu;
+}
+
+static umode_t hybrid_tsx_is_visible(struct kobject *kobj,
+ struct attribute *attr, int i)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct x86_hybrid_pmu *pmu =
+ container_of(dev_get_drvdata(dev), struct x86_hybrid_pmu, pmu);
+ int cpu = hybrid_find_supported_cpu(pmu);
+
+ return (cpu >= 0) && is_attr_for_this_pmu(kobj, attr) && cpu_has(&cpu_data(cpu), X86_FEATURE_RTM) ? attr->mode : 0;
+}
+
+static umode_t hybrid_format_is_visible(struct kobject *kobj,
+ struct attribute *attr, int i)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct x86_hybrid_pmu *pmu =
+ container_of(dev_get_drvdata(dev), struct x86_hybrid_pmu, pmu);
+ struct perf_pmu_format_hybrid_attr *pmu_attr =
+ container_of(attr, struct perf_pmu_format_hybrid_attr, attr.attr);
+ int cpu = hybrid_find_supported_cpu(pmu);
+
+ return (cpu >= 0) && (pmu->cpu_type & pmu_attr->pmu_type) ? attr->mode : 0;
+}
+
+static struct attribute_group hybrid_group_events_td = {
+ .name = "events",
+ .is_visible = hybrid_events_is_visible,
+};
+
+static struct attribute_group hybrid_group_events_mem = {
+ .name = "events",
+ .is_visible = hybrid_events_is_visible,
+};
+
+static struct attribute_group hybrid_group_events_tsx = {
+ .name = "events",
+ .is_visible = hybrid_tsx_is_visible,
+};
+
+static struct attribute_group hybrid_group_format_extra = {
+ .name = "format",
+ .is_visible = hybrid_format_is_visible,
+};
+
+static ssize_t intel_hybrid_get_attr_cpus(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct x86_hybrid_pmu *pmu =
+ container_of(dev_get_drvdata(dev), struct x86_hybrid_pmu, pmu);
+
+ return cpumap_print_to_pagebuf(true, buf, &pmu->supported_cpus);
+}
+
+static DEVICE_ATTR(cpus, S_IRUGO, intel_hybrid_get_attr_cpus, NULL);
+static struct attribute *intel_hybrid_cpus_attrs[] = {
+ &dev_attr_cpus.attr,
+ NULL,
+};
+
+static struct attribute_group hybrid_group_cpus = {
+ .attrs = intel_hybrid_cpus_attrs,
+};
+
+static const struct attribute_group *hybrid_attr_update[] = {
+ &hybrid_group_events_td,
+ &hybrid_group_events_mem,
+ &hybrid_group_events_tsx,
+ &group_caps_gen,
+ &group_caps_lbr,
+ &hybrid_group_format_extra,
+ &group_default,
+ &hybrid_group_cpus,
+ NULL,
+};
+
static struct attribute *empty_attrs;

__init int intel_pmu_init(void)
@@ -5884,14 +5984,22 @@ __init int intel_pmu_init(void)

snprintf(pmu_name_str, sizeof(pmu_name_str), "%s", name);

+ if (!IS_X86_HYBRID) {
+ group_events_td.attrs = td_attr;
+ group_events_mem.attrs = mem_attr;
+ group_events_tsx.attrs = tsx_attr;
+ group_format_extra.attrs = extra_attr;
+ group_format_extra_skl.attrs = extra_skl_attr;

- group_events_td.attrs = td_attr;
- group_events_mem.attrs = mem_attr;
- group_events_tsx.attrs = tsx_attr;
- group_format_extra.attrs = extra_attr;
- group_format_extra_skl.attrs = extra_skl_attr;
+ x86_pmu.attr_update = attr_update;
+ } else {
+ hybrid_group_events_td.attrs = td_attr;
+ hybrid_group_events_mem.attrs = mem_attr;
+ hybrid_group_events_tsx.attrs = tsx_attr;
+ hybrid_group_format_extra.attrs = extra_attr;

- x86_pmu.attr_update = attr_update;
+ x86_pmu.attr_update = hybrid_attr_update;
+ }

intel_pmu_check_num_counters(&x86_pmu.num_counters,
&x86_pmu.num_counters_fixed,
--
2.7.4

2021-02-08 18:14:59

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 35/49] perf parse-events: Create two hybrid hardware events

From: Jin Yao <[email protected]>

Hardware events have pre-defined configs. The kernel
needs to know which pmu an event comes from (e.g. the cpu_core pmu
or the cpu_atom pmu), but the perf type 'PERF_TYPE_HARDWARE'
can't carry pmu information.

So the kernel introduces a new type 'PERF_TYPE_HARDWARE_PMU'.
The new attr.config layout for PERF_TYPE_HARDWARE_PMU is:

0xDD000000AA
AA: original hardware event ID
DD: PMU type ID

PMU type ID is retrieved from sysfs. For example,

cat /sys/devices/cpu_atom/type
10

cat /sys/devices/cpu_core/type
4

When enabling a hybrid hardware event without a specified pmu, such as
'perf stat -e cycles -a', two events are created automatically. One
is for atom, the other is for core.

root@otcpl-adl-s-2:~# ./perf stat -e cycles -vv -a -- sleep 1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0x400000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 21
sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 22
sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 23
sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 24
sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 25
sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 26
sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
cycles: 0: 1254337 1001292571 1001292571
cycles: 1: 2595141 1001279813 1001279813
cycles: 2: 134853 1001276406 1001276406
cycles: 3: 81119 1001271089 1001271089
cycles: 4: 251353 1001264678 1001264678
cycles: 5: 415593 1001259163 1001259163
cycles: 6: 129643 1001265312 1001265312
cycles: 7: 80289 1001258979 1001258979
cycles: 8: 169983 1001251207 1001251207
cycles: 9: 81981 1001245487 1001245487
cycles: 10: 4116221 1001245537 1001245537
cycles: 11: 85531 1001253097 1001253097
cycles: 12: 3969132 1001254270 1001254270
cycles: 13: 96006 1001254691 1001254691
cycles: 14: 385004 1001244971 1001244971
cycles: 15: 394446 1001251437 1001251437
cycles: 0: 427330 1001253457 1001253457
cycles: 1: 444043 1001255914 1001255914
cycles: 2: 97285 1001253555 1001253555
cycles: 3: 92071 1001260556 1001260556
cycles: 4: 86292 1001249896 1001249896
cycles: 5: 236851 1001238979 1001238979
cycles: 6: 100081 1001239792 1001239792
cycles: 7: 72836 1001243276 1001243276
cycles: 14240632 16020168708 16020168708
cycles: 1556789 8009995425 8009995425

Performance counter stats for 'system wide':

14,240,632 cycles
1,556,789 cycles

1.002261231 seconds time elapsed

type 6 is PERF_TYPE_HARDWARE_PMU.
0x4 in 0x400000000 indicates the cpu_core pmu.
0xa in 0xa00000000 indicates the cpu_atom pmu.
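
As a quick check (not part of the patch), the config values in the -vv
output above follow directly from the layout and the sysfs type values. The
helper below is hypothetical and assumes the 32-bit shift implied by the
0xDD000000AA layout; the kernel-side equivalent is config_hybrid_attr() in
the diff below, which uses PERF_PMU_TYPE_SHIFT.

/* Illustration only: compose a PERF_TYPE_HARDWARE_PMU config from a pmu
 * type (sysfs "type" value) and a hardware event ID. */
#include <stdint.h>
#include <stdio.h>

static uint64_t hybrid_hw_config(uint32_t pmu_type, uint32_t hw_event_id)
{
	return ((uint64_t)pmu_type << 32) | hw_event_id;
}

int main(void)
{
	/* PERF_COUNT_HW_CPU_CYCLES is 0, so only the DD byte is set */
	printf("cpu_core cycles: %#llx\n",
	       (unsigned long long)hybrid_hw_config(4, 0));	/* 0x400000000 */
	printf("cpu_atom cycles: %#llx\n",
	       (unsigned long long)hybrid_hw_config(10, 0));	/* 0xa00000000 */
	return 0;
}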

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/parse-events.c | 73 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 73 insertions(+)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 81a6fce..1e767dc 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -446,6 +446,24 @@ static int config_attr(struct perf_event_attr *attr,
struct parse_events_error *err,
config_term_func_t config_term);

+static void config_hybrid_attr(struct perf_event_attr *attr,
+ int type, int pmu_type)
+{
+ /*
+ * attr.config layout:
+ * PERF_TYPE_HARDWARE_PMU: 0xDD000000AA
+ * AA: hardware event ID
+ * DD: PMU type ID
+ * PERF_TYPE_HW_CACHE_PMU: 0xDD00CCBBAA
+ * AA: hardware cache ID
+ * BB: hardware cache op ID
+ * CC: hardware cache op result ID
+ * DD: PMU type ID
+ */
+ attr->type = type;
+ attr->config = attr->config | ((__u64)pmu_type << PERF_PMU_TYPE_SHIFT);
+}
+
int parse_events_add_cache(struct list_head *list, int *idx,
char *type, char *op_result1, char *op_result2,
struct parse_events_error *err,
@@ -1409,6 +1427,47 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
err, head_config);
}

+static int create_hybrid_hw_event(struct parse_events_state *parse_state,
+ struct list_head *list,
+ struct perf_event_attr *attr,
+ struct perf_pmu *pmu)
+{
+ struct evsel *evsel;
+ __u32 type = attr->type;
+ __u64 config = attr->config;
+
+ config_hybrid_attr(attr, PERF_TYPE_HARDWARE_PMU, pmu->type);
+ evsel = __add_event(list, &parse_state->idx, attr, true, NULL,
+ pmu, NULL, false, NULL);
+ if (evsel)
+ evsel->pmu_name = strdup(pmu->name);
+ else
+ return -ENOMEM;
+
+ attr->type = type;
+ attr->config = config;
+ return 0;
+}
+
+static int add_hybrid_numeric(struct parse_events_state *parse_state,
+ struct list_head *list,
+ struct perf_event_attr *attr,
+ bool *hybrid)
+{
+ struct perf_pmu *pmu;
+ int ret;
+
+ *hybrid = false;
+ perf_pmu__for_each_hybrid_pmus(pmu) {
+ *hybrid = true;
+ ret = create_hybrid_hw_event(parse_state, list, attr, pmu);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
int parse_events_add_numeric(struct parse_events_state *parse_state,
struct list_head *list,
u32 type, u64 config,
@@ -1416,6 +1475,8 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
{
struct perf_event_attr attr;
LIST_HEAD(config_terms);
+ bool hybrid;
+ int ret;

memset(&attr, 0, sizeof(attr));
attr.type = type;
@@ -1430,6 +1491,18 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
return -ENOMEM;
}

+ /*
+ * Skip the software dummy event.
+ */
+ if (type != PERF_TYPE_SOFTWARE) {
+ if (!perf_pmu__hybrid_exist())
+ perf_pmu__scan(NULL);
+
+ ret = add_hybrid_numeric(parse_state, list, &attr, &hybrid);
+ if (hybrid)
+ return ret;
+ }
+
return add_event(list, &parse_state->idx, &attr,
get_config_name(head_config), &config_terms);
}
--
2.7.4

2021-02-08 18:14:59

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 17/49] perf/x86: Add structures for the attributes of Hybrid PMUs

From: Kan Liang <[email protected]>

Hybrid PMUs have different events and formats. In theory, Hybrid PMU
specific attributes should be maintained in the dedicated struct
x86_hybrid_pmu, but that wastes space because the events and formats are
similar among Hybrid PMUs.

To reduce duplication, all hybrid PMUs will share a group of attributes
in the following patch. To distinguish attributes that belong to different
Hybrid PMUs, a PMU-aware attribute structure is introduced. The attribute
structure carries a PMU type, which is for internal use only and is not
visible in the sysfs API.

Hybrid PMUs may support the same event name but with different event
encodings, e.g., the mem-loads event on an Atom PMU has a different
encoding than on a Core PMU. This causes a problem if two attributes are
created for them: the current sysfs_update_group() finds an attribute by
searching the attr name (aka event name), so if two attributes have the
same event name, the first attribute is replaced.
To address the issue, only one attribute is created for the event. The
event_str is extended and stores the event encodings from all Hybrid PMUs,
separated by ";". The order of the event encodings must follow the order
of the hybrid PMU index. The event_str is also for internal use only.
When a user reads the attribute of a Hybrid PMU, only the corresponding
part of the string is displayed.
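
A minimal sketch (illustration only, not the kernel implementation in the
diff below) of how the encoding for hybrid PMU index 'idx' can be picked
out of such a ";"-separated event_str, assuming one segment per hybrid PMU
in index order:

#include <stddef.h>
#include <string.h>

/* Return a pointer to the start of the segment for hybrid PMU index 'idx',
 * or NULL if the string has fewer segments than expected. The segment ends
 * at the next ';' or at the end of the string. */
static const char *hybrid_event_str(const char *event_str, int idx)
{
	const char *seg = event_str;

	while (idx-- > 0) {
		seg = strchr(seg, ';');
		if (!seg)
			return NULL;
		seg++;
	}
	return seg;
}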

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 43 +++++++++++++++++++++++++++++++++++++++++++
arch/x86/events/perf_event.h | 19 +++++++++++++++++++
include/linux/perf_event.h | 12 ++++++++++++
3 files changed, 74 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 44ad8dc..4d9dd83c 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1855,6 +1855,49 @@ ssize_t events_ht_sysfs_show(struct device *dev, struct device_attribute *attr,
pmu_attr->event_str_noht);
}

+ssize_t events_hybrid_sysfs_show(struct device *dev,
+ struct device_attribute *attr,
+ char *page)
+{
+ struct perf_pmu_events_hybrid_attr *pmu_attr =
+ container_of(attr, struct perf_pmu_events_hybrid_attr, attr);
+ struct x86_hybrid_pmu *pmu;
+ const char *str, *next_str;
+ int i;
+
+ if (hweight64(pmu_attr->pmu_type) == 1)
+ return sprintf(page, "%s", pmu_attr->event_str);
+
+ /*
+ * Hybrid PMUs may support the same event name, but with different
+ * event encoding, e.g., the mem-loads event on an Atom PMU has
+ * different event encoding from a Core PMU.
+ *
+ * The event_str includes all event encodings. Each event encoding
+ * is divided by ";". The order of the event encodings must follow
+ * the order of the hybrid PMU index.
+ */
+ pmu = container_of(dev_get_drvdata(dev), struct x86_hybrid_pmu, pmu);
+
+ str = pmu_attr->event_str;
+ for (i = 0; i < X86_HYBRID_PMU_MAX_INDEX; i++) {
+ if (!(x86_pmu.hybrid_pmu[i].cpu_type & pmu_attr->pmu_type))
+ continue;
+ if (x86_pmu.hybrid_pmu[i].cpu_type & pmu->cpu_type) {
+ next_str = strchr(str, ';');
+ if (next_str)
+ return snprintf(page, next_str - str + 1, "%s", str);
+ else
+ return sprintf(page, "%s", str);
+ }
+ str = strchr(str, ';');
+ str++;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(events_hybrid_sysfs_show);
+
EVENT_ATTR(cpu-cycles, CPU_CYCLES );
EVENT_ATTR(instructions, INSTRUCTIONS );
EVENT_ATTR(cache-references, CACHE_REFERENCES );
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 740ba48..84d629d 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -989,6 +989,22 @@ static struct perf_pmu_events_ht_attr event_attr_##v = { \
.event_str_ht = ht, \
}

+#define EVENT_ATTR_STR_HYBRID(_name, v, str, _pmu) \
+static struct perf_pmu_events_hybrid_attr event_attr_##v = { \
+ .attr = __ATTR(_name, 0444, events_hybrid_sysfs_show, NULL),\
+ .id = 0, \
+ .event_str = str, \
+ .pmu_type = _pmu, \
+}
+
+#define FORMAT_HYBRID_PTR(_id) (&format_attr_hybrid_##_id.attr.attr)
+
+#define FORMAT_ATTR_HYBRID(_name, _pmu) \
+static struct perf_pmu_format_hybrid_attr format_attr_hybrid_##_name = {\
+ .attr = __ATTR_RO(_name), \
+ .pmu_type = _pmu, \
+}
+
struct pmu *x86_get_pmu(void);
extern struct x86_pmu x86_pmu __read_mostly;

@@ -1156,6 +1172,9 @@ ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
char *page);
ssize_t events_ht_sysfs_show(struct device *dev, struct device_attribute *attr,
char *page);
+ssize_t events_hybrid_sysfs_show(struct device *dev,
+ struct device_attribute *attr,
+ char *page);

static inline bool fixed_counter_disabled(int i, struct cpu_hw_events *cpuc)
{
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index fab42cf..21ab3f5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1546,6 +1546,18 @@ struct perf_pmu_events_ht_attr {
const char *event_str_noht;
};

+struct perf_pmu_events_hybrid_attr {
+ struct device_attribute attr;
+ u64 id;
+ const char *event_str;
+ u64 pmu_type;
+};
+
+struct perf_pmu_format_hybrid_attr {
+ struct device_attribute attr;
+ u64 pmu_type;
+};
+
ssize_t perf_event_sysfs_show(struct device *dev, struct device_attribute *attr,
char *page);

--
2.7.4

2021-02-08 18:17:00

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 19/49] perf/x86: Support filter_match callback

From: Kan Liang <[email protected]>

Implement the filter_match callback for X86, which checks whether an event
is schedulable on the current CPU.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 10 ++++++++++
arch/x86/events/perf_event.h | 1 +
2 files changed, 11 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 4d9dd83c..b68d38a 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2613,6 +2613,14 @@ static int x86_pmu_aux_output_match(struct perf_event *event)
return 0;
}

+static int x86_pmu_filter_match(struct perf_event *event)
+{
+ if (x86_pmu.filter_match)
+ return x86_pmu.filter_match(event);
+
+ return 1;
+}
+
static struct pmu pmu = {
.pmu_enable = x86_pmu_enable,
.pmu_disable = x86_pmu_disable,
@@ -2640,6 +2648,8 @@ static struct pmu pmu = {
.check_period = x86_pmu_check_period,

.aux_output_match = x86_pmu_aux_output_match,
+
+ .filter_match = x86_pmu_filter_match,
};

void arch_perf_update_userpage(struct perf_event *event,
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 84d629d..5759f96 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -900,6 +900,7 @@ struct x86_pmu {
*/
unsigned long hybrid_pmu_bitmap;
struct x86_hybrid_pmu hybrid_pmu[X86_HYBRID_PMU_MAX_INDEX];
+ int (*filter_match)(struct perf_event *event);
};

struct x86_perf_task_context_opt {
--
2.7.4

2021-02-08 18:18:28

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 23/49] perf/x86/msr: Add Alder Lake CPU support

From: Kan Liang <[email protected]>

PPERF and SMI_COUNT MSRs are also supported on Alder Lake.

The External Design Specification (EDS) is not published yet. It comes
from an authoritative internal source.

The patch has been tested on real hardware.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/msr.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
index 680404c..c853b28 100644
--- a/arch/x86/events/msr.c
+++ b/arch/x86/events/msr.c
@@ -100,6 +100,8 @@ static bool test_intel(int idx, void *data)
case INTEL_FAM6_TIGERLAKE_L:
case INTEL_FAM6_TIGERLAKE:
case INTEL_FAM6_ROCKETLAKE:
+ case INTEL_FAM6_ALDERLAKE:
+ case INTEL_FAM6_ALDERLAKE_L:
if (idx == PERF_MSR_SMI || idx == PERF_MSR_PPERF)
return true;
break;
--
2.7.4

2021-02-08 18:18:36

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 20/49] perf/x86/intel: Add Alder Lake Hybrid support

From: Kan Liang <[email protected]>

The Alder Lake Hybrid system has two different types of cores, Golden Cove
cores and Gracemont cores. The Golden Cove cores are registered to the
"cpu_core" PMU. The Gracemont cores are registered to the "cpu_atom" PMU.

The differences between the two PMUs include:
- Number of GP and fixed counters
- Events
- The "cpu_core" PMU supports Topdown metrics.
  The "cpu_atom" PMU supports PEBS-via-PT.

The "cpu_core" PMU is similar to the Sapphire Rapids PMU, but without
PMEM.
The "cpu_atom" PMU is similar to Tremont, but with different
event_constraints, extra_regs and number of counters.

Users may disable all CPUs of the same CPU type on the command line or
in the BIOS. In that case, perf still initializes a PMU for the CPU
type, but does not register it. The PMU is only registered once a
corresponding CPU comes online.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 227 +++++++++++++++++++++++++++++++++++++++++++
arch/x86/events/intel/ds.c | 7 ++
arch/x86/events/perf_event.h | 2 +
3 files changed, 236 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ea2541b..fcbf72f 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2076,6 +2076,14 @@ static struct extra_reg intel_tnt_extra_regs[] __read_mostly = {
EVENT_EXTRA_END
};

+static struct extra_reg intel_grt_extra_regs[] __read_mostly = {
+ /* must define OFFCORE_RSP_X first, see intel_fixup_er() */
+ INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x3fffffffffull, RSP_0),
+ INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0x3fffffffffull, RSP_1),
+ INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x5d0),
+ EVENT_EXTRA_END
+};
+
#define KNL_OT_L2_HITE BIT_ULL(19) /* Other Tile L2 Hit */
#define KNL_OT_L2_HITF BIT_ULL(20) /* Other Tile L2 Hit */
#define KNL_MCDRAM_LOCAL BIT_ULL(21)
@@ -2430,6 +2438,16 @@ static int icl_set_topdown_event_period(struct perf_event *event)
return 0;
}

+static int adl_set_topdown_event_period(struct perf_event *event)
+{
+ struct x86_hybrid_pmu *pmu = container_of(event->pmu, struct x86_hybrid_pmu, pmu);
+
+ if (pmu->cpu_type != X86_HYBRID_CORE_CPU_TYPE)
+ return 0;
+
+ return icl_set_topdown_event_period(event);
+}
+
static inline u64 icl_get_metrics_event_value(u64 metric, u64 slots, int idx)
{
u32 val;
@@ -2570,6 +2588,17 @@ static u64 icl_update_topdown_event(struct perf_event *event)
x86_pmu.num_topdown_events - 1);
}

+static u64 adl_update_topdown_event(struct perf_event *event)
+{
+ struct x86_hybrid_pmu *pmu = container_of(event->pmu, struct x86_hybrid_pmu, pmu);
+
+ if (pmu->cpu_type != X86_HYBRID_CORE_CPU_TYPE)
+ return 0;
+
+ return icl_update_topdown_event(event);
+}
+
+
static void intel_pmu_read_topdown_event(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -4063,6 +4092,32 @@ tfa_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
return c;
}

+static struct event_constraint *
+adl_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
+ struct perf_event *event)
+{
+ if (cpuc->hybrid_pmu_idx == X86_HYBRID_PMU_CORE_IDX)
+ return spr_get_event_constraints(cpuc, idx, event);
+ else if (cpuc->hybrid_pmu_idx == X86_HYBRID_PMU_ATOM_IDX)
+ return tnt_get_event_constraints(cpuc, idx, event);
+
+ WARN_ON(1);
+ return &emptyconstraint;
+}
+
+static int adl_hw_config(struct perf_event *event)
+{
+ struct x86_hybrid_pmu *pmu = container_of(event->pmu, struct x86_hybrid_pmu, pmu);
+
+ if (pmu->cpu_type == X86_HYBRID_CORE_CPU_TYPE)
+ return hsw_hw_config(event);
+ else if (pmu->cpu_type == X86_HYBRID_ATOM_CPU_TYPE)
+ return intel_pmu_hw_config(event);
+
+ WARN_ON(1);
+ return -EOPNOTSUPP;
+}
+
/*
* Broadwell:
*
@@ -4555,6 +4610,14 @@ static int intel_pmu_aux_output_match(struct perf_event *event)
return is_intel_pt_event(event);
}

+static int intel_pmu_filter_match(struct perf_event *event)
+{
+ struct x86_hybrid_pmu *pmu = container_of(event->pmu, struct x86_hybrid_pmu, pmu);
+ unsigned int cpu = smp_processor_id();
+
+ return cpumask_test_cpu(cpu, &pmu->supported_cpus);
+}
+
PMU_FORMAT_ATTR(offcore_rsp, "config1:0-63");

PMU_FORMAT_ATTR(ldlat, "config1:0-15");
@@ -5254,6 +5317,84 @@ static const struct attribute_group *attr_update[] = {
NULL,
};

+EVENT_ATTR_STR_HYBRID(slots, slots_hybrid, "event=0x00,umask=0x4", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(topdown-retiring, td_retiring_hybrid, "event=0xc2,umask=0x0;event=0x00,umask=0x80", X86_HYBRID_ATOM_CPU_TYPE | X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(topdown-bad-spec, td_bad_spec_hybrid, "event=0x73,umask=0x0;event=0x00,umask=0x81", X86_HYBRID_ATOM_CPU_TYPE | X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(topdown-fe-bound, td_fe_bound_hybrid, "event=0x71,umask=0x0;event=0x00,umask=0x82", X86_HYBRID_ATOM_CPU_TYPE | X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(topdown-be-bound, td_be_bound_hybrid, "event=0x74,umask=0x0;event=0x00,umask=0x83", X86_HYBRID_ATOM_CPU_TYPE | X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(topdown-heavy-ops, td_heavy_ops_hybrid, "event=0x00,umask=0x84", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(topdown-br-mispredict, td_br_mispredict_hybrid, "event=0x00,umask=0x85", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(topdown-fetch-lat, td_fetch_lat_hybrid, "event=0x00,umask=0x86", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(topdown-mem-bound, td_mem_bound_hybrid, "event=0x00,umask=0x87", X86_HYBRID_CORE_CPU_TYPE);
+
+static struct attribute *adl_hybrid_events_attrs[] = {
+ EVENT_PTR(slots_hybrid),
+ EVENT_PTR(td_retiring_hybrid),
+ EVENT_PTR(td_bad_spec_hybrid),
+ EVENT_PTR(td_fe_bound_hybrid),
+ EVENT_PTR(td_be_bound_hybrid),
+ EVENT_PTR(td_heavy_ops_hybrid),
+ EVENT_PTR(td_br_mispredict_hybrid),
+ EVENT_PTR(td_fetch_lat_hybrid),
+ EVENT_PTR(td_mem_bound_hybrid),
+ NULL,
+};
+
+/* Must be in IDX order */
+EVENT_ATTR_STR_HYBRID(mem-loads, mem_ld_adl_hybrid, "event=0xd0,umask=0x5,ldlat=3;event=0xcd,umask=0x1,ldlat=3", X86_HYBRID_ATOM_CPU_TYPE | X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(mem-stores, mem_st_adl_hybrid, "event=0xd0,umask=0x6;event=0xcd,umask=0x2", X86_HYBRID_ATOM_CPU_TYPE | X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(mem-loads-aux, mem_ld_aux_hybrid, "event=0x03,umask=0x82", X86_HYBRID_CORE_CPU_TYPE);
+
+static struct attribute *adl_hybrid_mem_attrs[] = {
+ EVENT_PTR(mem_ld_adl_hybrid),
+ EVENT_PTR(mem_st_adl_hybrid),
+ EVENT_PTR(mem_ld_aux_hybrid),
+ NULL,
+};
+
+EVENT_ATTR_STR_HYBRID(tx-start, tx_start_adl_hybrid, "event=0xc9,umask=0x1", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(tx-commit, tx_commit_adl_hybrid, "event=0xc9,umask=0x2", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(tx-abort, tx_abort_adl_hybrid, "event=0xc9,umask=0x4", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(tx-conflict, tx_conflict_adl_hybrid, "event=0x54,umask=0x1", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(cycles-t, cycles_t_adl_hybrid, "event=0x3c,in_tx=1", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(cycles-ct, cycles_ct_adl_hybrid, "event=0x3c,in_tx=1,in_tx_cp=1", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(tx-capacity-read, tx_capacity_read_adl_hybrid, "event=0x54,umask=0x80", X86_HYBRID_CORE_CPU_TYPE);
+EVENT_ATTR_STR_HYBRID(tx-capacity-write, tx_capacity_write_adl_hybrid, "event=0x54,umask=0x2", X86_HYBRID_CORE_CPU_TYPE);
+
+static struct attribute *adl_hybrid_tsx_attrs[] = {
+ EVENT_PTR(tx_start_adl_hybrid),
+ EVENT_PTR(tx_abort_adl_hybrid),
+ EVENT_PTR(tx_commit_adl_hybrid),
+ EVENT_PTR(tx_capacity_read_adl_hybrid),
+ EVENT_PTR(tx_capacity_write_adl_hybrid),
+ EVENT_PTR(tx_conflict_adl_hybrid),
+ EVENT_PTR(cycles_t_adl_hybrid),
+ EVENT_PTR(cycles_ct_adl_hybrid),
+ NULL,
+};
+
+FORMAT_ATTR_HYBRID(in_tx, X86_HYBRID_CORE_CPU_TYPE);
+FORMAT_ATTR_HYBRID(in_tx_cp, X86_HYBRID_CORE_CPU_TYPE);
+FORMAT_ATTR_HYBRID(offcore_rsp, X86_HYBRID_CORE_CPU_TYPE | X86_HYBRID_ATOM_CPU_TYPE);
+FORMAT_ATTR_HYBRID(ldlat, X86_HYBRID_CORE_CPU_TYPE | X86_HYBRID_ATOM_CPU_TYPE);
+FORMAT_ATTR_HYBRID(frontend, X86_HYBRID_CORE_CPU_TYPE);
+
+static struct attribute *adl_hybrid_extra_attr_rtm[] = {
+ FORMAT_HYBRID_PTR(in_tx),
+ FORMAT_HYBRID_PTR(in_tx_cp),
+ FORMAT_HYBRID_PTR(offcore_rsp),
+ FORMAT_HYBRID_PTR(ldlat),
+ FORMAT_HYBRID_PTR(frontend),
+ NULL,
+};
+
+static struct attribute *adl_hybrid_extra_attr[] = {
+ FORMAT_HYBRID_PTR(offcore_rsp),
+ FORMAT_HYBRID_PTR(ldlat),
+ FORMAT_HYBRID_PTR(frontend),
+ NULL,
+};
+
static bool is_attr_for_this_pmu(struct kobject *kobj, struct attribute *attr)
{
struct device *dev = kobj_to_dev(kobj);
@@ -5370,6 +5511,7 @@ __init int intel_pmu_init(void)
bool pmem = false;
int version, i;
char *name;
+ struct x86_hybrid_pmu *pmu;

if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
switch (boot_cpu_data.x86) {
@@ -5964,6 +6106,91 @@ __init int intel_pmu_init(void)
name = "sapphire_rapids";
break;

+ case INTEL_FAM6_ALDERLAKE:
+ case INTEL_FAM6_ALDERLAKE_L:
+ /*
+ * Alder Lake has 2 types of CPU, core and atom.
+ *
+ * Initialize the common PerfMon capabilities here.
+ */
+ x86_pmu.late_ack = true;
+ x86_pmu.pebs_aliases = NULL;
+ x86_pmu.pebs_prec_dist = true;
+ x86_pmu.flags |= PMU_FL_HAS_RSP_1;
+ x86_pmu.flags |= PMU_FL_NO_HT_SHARING;
+ x86_pmu.flags |= PMU_FL_PEBS_ALL;
+ x86_pmu.flags |= PMU_FL_INSTR_LATENCY;
+ x86_pmu.flags |= PMU_FL_MEM_LOADS_AUX;
+ x86_pmu.lbr_pt_coexist = true;
+ intel_pmu_pebs_data_source_skl(false);
+ x86_pmu.num_topdown_events = 8;
+ x86_pmu.update_topdown_event = adl_update_topdown_event;
+ x86_pmu.set_topdown_event_period = adl_set_topdown_event_period;
+
+ x86_pmu.filter_match = intel_pmu_filter_match;
+ x86_pmu.get_event_constraints = adl_get_event_constraints;
+ x86_pmu.hw_config = adl_hw_config;
+ x86_pmu.limit_period = spr_limit_period;
+ /*
+ * The rtm_abort_event is used to check whether to enable GPRs
+ * for the RTM abort event. Atom doesn't have the RTM abort
+ * event. There is no harm in setting it in the common
+ * x86_pmu.rtm_abort_event.
+ */
+ x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xc9, .umask=0x04);
+
+ td_attr = adl_hybrid_events_attrs;
+ mem_attr = adl_hybrid_mem_attrs;
+ tsx_attr = adl_hybrid_tsx_attrs;
+ extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
+ adl_hybrid_extra_attr_rtm : adl_hybrid_extra_attr;
+
+ /* Initialize big core specific PerfMon capabilities.*/
+ set_bit(X86_HYBRID_PMU_CORE_IDX, &x86_pmu.hybrid_pmu_bitmap);
+ pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_CORE_IDX];
+ pmu->name = "cpu_core";
+ pmu->cpu_type = X86_HYBRID_CORE_CPU_TYPE;
+ pmu->num_counters = x86_pmu.num_counters + 2;
+ pmu->num_counters_fixed = x86_pmu.num_counters_fixed + 1;
+ pmu->max_pebs_events = min_t(unsigned, MAX_PEBS_EVENTS, pmu->num_counters);
+ pmu->unconstrained = (struct event_constraint)
+ __EVENT_CONSTRAINT(0, (1ULL << pmu->num_counters) - 1,
+ 0, pmu->num_counters, 0, 0);
+ pmu->intel_cap.capabilities = x86_pmu.intel_cap.capabilities;
+ pmu->intel_cap.perf_metrics = 1;
+ pmu->intel_cap.pebs_output_pt_available = 0;
+
+ memcpy(pmu->hw_cache_event_ids, spr_hw_cache_event_ids, sizeof(pmu->hw_cache_event_ids));
+ memcpy(pmu->hw_cache_extra_regs, spr_hw_cache_extra_regs, sizeof(pmu->hw_cache_extra_regs));
+ pmu->event_constraints = intel_spr_event_constraints;
+ pmu->pebs_constraints = intel_spr_pebs_event_constraints;
+ pmu->extra_regs = intel_spr_extra_regs;
+
+ /* Initialize Atom core specific PerfMon capabilities.*/
+ set_bit(X86_HYBRID_PMU_ATOM_IDX, &x86_pmu.hybrid_pmu_bitmap);
+ pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_ATOM_IDX];
+ pmu->name = "cpu_atom";
+ pmu->cpu_type = X86_HYBRID_ATOM_CPU_TYPE;
+ pmu->num_counters = x86_pmu.num_counters;
+ pmu->num_counters_fixed = x86_pmu.num_counters_fixed;
+ pmu->max_pebs_events = x86_pmu.max_pebs_events;
+ pmu->unconstrained = (struct event_constraint)
+ __EVENT_CONSTRAINT(0, (1ULL << pmu->num_counters) - 1,
+ 0, pmu->num_counters, 0, 0);
+ pmu->intel_cap.capabilities = x86_pmu.intel_cap.capabilities;
+ pmu->intel_cap.perf_metrics = 0;
+ pmu->intel_cap.pebs_output_pt_available = 1;
+
+ memcpy(pmu->hw_cache_event_ids, glp_hw_cache_event_ids, sizeof(pmu->hw_cache_event_ids));
+ memcpy(pmu->hw_cache_extra_regs, tnt_hw_cache_extra_regs, sizeof(pmu->hw_cache_extra_regs));
+ pmu->hw_cache_event_ids[C(ITLB)][C(OP_READ)][C(RESULT_ACCESS)] = -1;
+ pmu->event_constraints = intel_slm_event_constraints;
+ pmu->pebs_constraints = intel_grt_pebs_event_constraints;
+ pmu->extra_regs = intel_grt_extra_regs;
+ pr_cont("Alderlake Hybrid events, ");
+ name = "alderlake_hybrid";
+ break;
+
default:
switch (x86_pmu.version) {
case 1:
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index ba651d9..1783fcf 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -779,6 +779,13 @@ struct event_constraint intel_glm_pebs_event_constraints[] = {
EVENT_CONSTRAINT_END
};

+struct event_constraint intel_grt_pebs_event_constraints[] = {
+ /* Allow all events as PEBS with no flags */
+ INTEL_PLD_CONSTRAINT(0x5d0, 0xf),
+ INTEL_PSD_CONSTRAINT(0x6d0, 0xf),
+ EVENT_CONSTRAINT_END
+};
+
struct event_constraint intel_nehalem_pebs_event_constraints[] = {
INTEL_PLD_CONSTRAINT(0x100b, 0xf), /* MEM_INST_RETIRED.* */
INTEL_FLAGS_EVENT_CONSTRAINT(0x0f, 0xf), /* MEM_UNCORE_RETIRED.* */
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 5759f96..de193e6 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1265,6 +1265,8 @@ extern struct event_constraint intel_glm_pebs_event_constraints[];

extern struct event_constraint intel_glp_pebs_event_constraints[];

+extern struct event_constraint intel_grt_pebs_event_constraints[];
+
extern struct event_constraint intel_nehalem_pebs_event_constraints[];

extern struct event_constraint intel_westmere_pebs_event_constraints[];
--
2.7.4

2021-02-08 18:18:45

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 21/49] perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU

From: Kan Liang <[email protected]>

The current Hardware events and Hardware cache events have special perf
types, PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. These two types don't
carry the PMU type in the user interface, so on a hybrid system the perf
subsystem doesn't know which PMU the events belong to. The first capable
PMU is always assigned to the events, and the events never get a chance
to run on the other capable PMUs.

Add PMU-aware versions, PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU.
The PMU type ID is stored at attr.config[40:32].
Support the new types for X86.
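
As a user-space usage sketch (assumptions: PERF_TYPE_HARDWARE_PMU and the
attr.config layout are the ones added by this patch; the PMU type id is
obtained by the caller, e.g. from /sys/devices/cpu_core/type):

#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* Open a CPU-cycles event on one specific hybrid PMU identified by
 * 'pmu_type'. Returns a perf event fd or -1 on error. */
static int open_hybrid_cycles(unsigned long long pmu_type, int cpu)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size   = sizeof(attr);
	attr.type   = PERF_TYPE_HARDWARE_PMU;	/* new type 6 */
	attr.config = PERF_COUNT_HW_CPU_CYCLES | (pmu_type << 32);

	return syscall(__NR_perf_event_open, &attr, -1 /* pid */, cpu,
		       -1 /* group_fd */, 0 /* flags */);
}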

Suggested-by: Andi Kleen <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 10 ++++++++--
include/uapi/linux/perf_event.h | 26 ++++++++++++++++++++++++++
kernel/events/core.c | 14 +++++++++++++-
3 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index b68d38a..c48e37c 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -483,7 +483,7 @@ int x86_setup_perfctr(struct perf_event *event)
if (attr->type == event->pmu->type)
return x86_pmu_extra_regs(event->attr.config, event);

- if (attr->type == PERF_TYPE_HW_CACHE)
+ if ((attr->type == PERF_TYPE_HW_CACHE) || (attr->type == PERF_TYPE_HW_CACHE_PMU))
return set_ext_hw_attr(hwc, event);

if (attr->config >= x86_pmu.max_events)
@@ -2415,9 +2415,15 @@ static int x86_pmu_event_init(struct perf_event *event)

if ((event->attr.type != event->pmu->type) &&
(event->attr.type != PERF_TYPE_HARDWARE) &&
- (event->attr.type != PERF_TYPE_HW_CACHE))
+ (event->attr.type != PERF_TYPE_HW_CACHE) &&
+ (event->attr.type != PERF_TYPE_HARDWARE_PMU) &&
+ (event->attr.type != PERF_TYPE_HW_CACHE_PMU))
return -ENOENT;

+ if ((event->attr.type == PERF_TYPE_HARDWARE_PMU) ||
+ (event->attr.type == PERF_TYPE_HW_CACHE_PMU))
+ event->attr.config &= PERF_HW_CACHE_EVENT_MASK;
+
if (IS_X86_HYBRID && (event->cpu != -1)) {
hybrid_pmu = container_of(event->pmu, struct x86_hybrid_pmu, pmu);
if (!cpumask_test_cpu(event->cpu, &hybrid_pmu->supported_cpus))
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 7d292de5..83ab6a6 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -33,6 +33,8 @@ enum perf_type_id {
PERF_TYPE_HW_CACHE = 3,
PERF_TYPE_RAW = 4,
PERF_TYPE_BREAKPOINT = 5,
+ PERF_TYPE_HARDWARE_PMU = 6,
+ PERF_TYPE_HW_CACHE_PMU = 7,

PERF_TYPE_MAX, /* non-ABI */
};
@@ -95,6 +97,30 @@ enum perf_hw_cache_op_result_id {
};

/*
+ * attr.config layout for type PERF_TYPE_HARDWARE* and PERF_TYPE_HW_CACHE*
+ * PERF_TYPE_HARDWARE: 0xAA
+ * AA: hardware event ID
+ * PERF_TYPE_HW_CACHE: 0xCCBBAA
+ * AA: hardware cache ID
+ * BB: hardware cache op ID
+ * CC: hardware cache op result ID
+ * PERF_TYPE_HARDWARE_PMU: 0xDD000000AA
+ * AA: hardware event ID
+ * DD: PMU type ID
+ * PERF_TYPE_HW_CACHE_PMU: 0xDD00CCBBAA
+ * AA: hardware cache ID
+ * BB: hardware cache op ID
+ * CC: hardware cache op result ID
+ * DD: PMU type ID
+ */
+#define PERF_HW_CACHE_ID_SHIFT 0
+#define PERF_HW_CACHE_OP_ID_SHIFT 8
+#define PERF_HW_CACHE_OP_RESULT_ID_SHIFT 16
+#define PERF_HW_CACHE_EVENT_MASK 0xffffff
+
+#define PERF_PMU_TYPE_SHIFT 32
+
+/*
* Special "software" events provided by the kernel, even if the hardware
* does not support performance events. These events measure various
* physical and sw events of the kernel (and allow the profiling of them as
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5206097..04465b2 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11052,6 +11052,14 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
return ret;
}

+static bool perf_event_is_hw_pmu_type(struct perf_event *event)
+{
+ int type = event->attr.type;
+
+ return type == PERF_TYPE_HARDWARE_PMU ||
+ type == PERF_TYPE_HW_CACHE_PMU;
+}
+
static struct pmu *perf_init_event(struct perf_event *event)
{
int idx, type, ret;
@@ -11075,13 +11083,17 @@ static struct pmu *perf_init_event(struct perf_event *event)
if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE)
type = PERF_TYPE_RAW;

+ if (perf_event_is_hw_pmu_type(event))
+ type = event->attr.config >> PERF_PMU_TYPE_SHIFT;
+
again:
rcu_read_lock();
pmu = idr_find(&pmu_idr, type);
rcu_read_unlock();
if (pmu) {
ret = perf_try_init_event(pmu, event);
- if (ret == -ENOENT && event->attr.type != type) {
+ if (ret == -ENOENT && event->attr.type != type &&
+ !perf_event_is_hw_pmu_type(event)) {
type = event->attr.type;
goto again;
}
--
2.7.4

2021-02-08 18:19:00

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 44/49] perf stat: Uniquify hybrid event name

From: Jin Yao <[email protected]>

It would be useful to tell the user which PMU an event belongs to.
perf-stat already supports the '--no-merge' option, which prints the PMU
name after the event name.

Now this option is enabled by default on hybrid platforms.

Before:

root@otcpl-adl-s-2:~# ./perf stat -e cycles -a -- sleep 1

Performance counter stats for 'system wide':

10,301,466 cycles
1,557,794 cycles

1.002068584 seconds time elapsed

After:

root@otcpl-adl-s-2:~# ./perf stat -e cycles -a -- sleep 1

Performance counter stats for 'system wide':

11,190,657 cycles [cpu_core]
669,063 cycles [cpu_atom]

1.002147571 seconds time elapsed

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/builtin-stat.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 0b08665..bfe7305 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2379,6 +2379,9 @@ int cmd_stat(int argc, const char **argv)

evlist__check_cpu_maps(evsel_list);

+ if (perf_pmu__hybrid_exist())
+ stat_config.no_merge = true;
+
/*
* Initialize thread_map with comm names,
* so we could print it out on output.
--
2.7.4

2021-02-08 18:19:00

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 24/49] perf/x86/cstate: Add Alder Lake CPU support

From: Kan Liang <[email protected]>

Compared with Rocket Lake, the CORE C1 Residency Counter is added for
Alder Lake, and the CORE C3 Residency Counter is removed. The other
counters are the same.

Create a new adl_cstates for Alder Lake. Update the comments
accordingly.

The External Design Specification (EDS) is not published yet. It comes
from an authoritative internal source.

The patch has been tested on real hardware.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/cstate.c | 39 +++++++++++++++++++++++++++++----------
1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 407eee5..4333990 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -40,7 +40,7 @@
* Model specific counters:
* MSR_CORE_C1_RES: CORE C1 Residency Counter
* perf code: 0x00
- * Available model: SLM,AMT,GLM,CNL,TNT
+ * Available model: SLM,AMT,GLM,CNL,TNT,ADL
* Scope: Core (each processor core has a MSR)
* MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
* perf code: 0x01
@@ -51,46 +51,49 @@
* perf code: 0x02
* Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
* SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL,
- * TNT,RKL
+ * TNT,RKL,ADL
* Scope: Core
* MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
* perf code: 0x03
* Available model: SNB,IVB,HSW,BDW,SKL,CNL,KBL,CML,
- * ICL,TGL,RKL
+ * ICL,TGL,RKL,ADL
* Scope: Core
* MSR_PKG_C2_RESIDENCY: Package C2 Residency Counter.
* perf code: 0x00
* Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL,
- * KBL,CML,ICL,TGL,TNT,RKL
+ * KBL,CML,ICL,TGL,TNT,RKL,ADL
* Scope: Package (physical package)
* MSR_PKG_C3_RESIDENCY: Package C3 Residency Counter.
* perf code: 0x01
* Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL,
- * GLM,CNL,KBL,CML,ICL,TGL,TNT,RKL
+ * GLM,CNL,KBL,CML,ICL,TGL,TNT,RKL,
+ * ADL
* Scope: Package (physical package)
* MSR_PKG_C6_RESIDENCY: Package C6 Residency Counter.
* perf code: 0x02
* Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
* SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL,
- * TNT,RKL
+ * TNT,RKL,ADL
* Scope: Package (physical package)
* MSR_PKG_C7_RESIDENCY: Package C7 Residency Counter.
* perf code: 0x03
* Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,CNL,
- * KBL,CML,ICL,TGL,RKL
+ * KBL,CML,ICL,TGL,RKL,ADL
* Scope: Package (physical package)
* MSR_PKG_C8_RESIDENCY: Package C8 Residency Counter.
* perf code: 0x04
- * Available model: HSW ULT,KBL,CNL,CML,ICL,TGL,RKL
+ * Available model: HSW ULT,KBL,CNL,CML,ICL,TGL,RKL,
+ * ADL
* Scope: Package (physical package)
* MSR_PKG_C9_RESIDENCY: Package C9 Residency Counter.
* perf code: 0x05
- * Available model: HSW ULT,KBL,CNL,CML,ICL,TGL,RKL
+ * Available model: HSW ULT,KBL,CNL,CML,ICL,TGL,RKL,
+ * ADL
* Scope: Package (physical package)
* MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
* perf code: 0x06
* Available model: HSW ULT,KBL,GLM,CNL,CML,ICL,TGL,
- * TNT,RKL
+ * TNT,RKL,ADL
* Scope: Package (physical package)
*
*/
@@ -563,6 +566,20 @@ static const struct cstate_model icl_cstates __initconst = {
BIT(PERF_CSTATE_PKG_C10_RES),
};

+static const struct cstate_model adl_cstates __initconst = {
+ .core_events = BIT(PERF_CSTATE_CORE_C1_RES) |
+ BIT(PERF_CSTATE_CORE_C6_RES) |
+ BIT(PERF_CSTATE_CORE_C7_RES),
+
+ .pkg_events = BIT(PERF_CSTATE_PKG_C2_RES) |
+ BIT(PERF_CSTATE_PKG_C3_RES) |
+ BIT(PERF_CSTATE_PKG_C6_RES) |
+ BIT(PERF_CSTATE_PKG_C7_RES) |
+ BIT(PERF_CSTATE_PKG_C8_RES) |
+ BIT(PERF_CSTATE_PKG_C9_RES) |
+ BIT(PERF_CSTATE_PKG_C10_RES),
+};
+
static const struct cstate_model slm_cstates __initconst = {
.core_events = BIT(PERF_CSTATE_CORE_C1_RES) |
BIT(PERF_CSTATE_CORE_C6_RES),
@@ -650,6 +667,8 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE_L, &icl_cstates),
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE, &icl_cstates),
X86_MATCH_INTEL_FAM6_MODEL(ROCKETLAKE, &icl_cstates),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &adl_cstates),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &adl_cstates),
{ },
};
MODULE_DEVICE_TABLE(x86cpu, intel_cstates_match);
--
2.7.4

2021-02-08 18:19:00

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 22/49] perf/x86/intel/uncore: Add Alder Lake support

From: Kan Liang <[email protected]>

The uncore subsystem for Alder Lake is similar to the previous Tiger
Lake.

The differences include:
- New MSR addresses for global control, fixed counters, CBOX and ARB.
Add a new adl_uncore_msr_ops for uncore operations.
- Add a new threshold field for CBOX.
- New PCIIDs for IMC devices.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/uncore.c | 7 ++
arch/x86/events/intel/uncore.h | 1 +
arch/x86/events/intel/uncore_snb.c | 131 +++++++++++++++++++++++++++++++++++++
3 files changed, 139 insertions(+)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 33c8180..3ad5df2 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1625,6 +1625,11 @@ static const struct intel_uncore_init_fun rkl_uncore_init __initconst = {
.pci_init = skl_uncore_pci_init,
};

+static const struct intel_uncore_init_fun adl_uncore_init __initconst = {
+ .cpu_init = adl_uncore_cpu_init,
+ .mmio_init = tgl_uncore_mmio_init,
+};
+
static const struct intel_uncore_init_fun icx_uncore_init __initconst = {
.cpu_init = icx_uncore_cpu_init,
.pci_init = icx_uncore_pci_init,
@@ -1673,6 +1678,8 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE_L, &tgl_l_uncore_init),
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE, &tgl_uncore_init),
X86_MATCH_INTEL_FAM6_MODEL(ROCKETLAKE, &rkl_uncore_init),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &adl_uncore_init),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &adl_uncore_init),
X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_D, &snr_uncore_init),
{},
};
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index a3c6e16..30e6557 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -567,6 +567,7 @@ void snb_uncore_cpu_init(void);
void nhm_uncore_cpu_init(void);
void skl_uncore_cpu_init(void);
void icl_uncore_cpu_init(void);
+void adl_uncore_cpu_init(void);
void tgl_uncore_cpu_init(void);
void tgl_uncore_mmio_init(void);
void tgl_l_uncore_mmio_init(void);
diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
index 5127128..0f63706 100644
--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -62,6 +62,8 @@
#define PCI_DEVICE_ID_INTEL_TGL_H_IMC 0x9a36
#define PCI_DEVICE_ID_INTEL_RKL_1_IMC 0x4c43
#define PCI_DEVICE_ID_INTEL_RKL_2_IMC 0x4c53
+#define PCI_DEVICE_ID_INTEL_ADL_1_IMC 0x4660
+#define PCI_DEVICE_ID_INTEL_ADL_2_IMC 0x4641

/* SNB event control */
#define SNB_UNC_CTL_EV_SEL_MASK 0x000000ff
@@ -131,12 +133,33 @@
#define ICL_UNC_ARB_PER_CTR 0x3b1
#define ICL_UNC_ARB_PERFEVTSEL 0x3b3

+/* ADL uncore global control */
+#define ADL_UNC_PERF_GLOBAL_CTL 0x2ff0
+#define ADL_UNC_FIXED_CTR_CTRL 0x2fde
+#define ADL_UNC_FIXED_CTR 0x2fdf
+
+/* ADL Cbo register */
+#define ADL_UNC_CBO_0_PER_CTR0 0x2002
+#define ADL_UNC_CBO_0_PERFEVTSEL0 0x2000
+#define ADL_UNC_CTL_THRESHOLD 0x3f000000
+#define ADL_UNC_RAW_EVENT_MASK (SNB_UNC_CTL_EV_SEL_MASK | \
+ SNB_UNC_CTL_UMASK_MASK | \
+ SNB_UNC_CTL_EDGE_DET | \
+ SNB_UNC_CTL_INVERT | \
+ ADL_UNC_CTL_THRESHOLD)
+
+/* ADL ARB register */
+#define ADL_UNC_ARB_PER_CTR0 0x2FD2
+#define ADL_UNC_ARB_PERFEVTSEL0 0x2FD0
+#define ADL_UNC_ARB_MSR_OFFSET 0x8
+
DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
DEFINE_UNCORE_FORMAT_ATTR(umask, umask, "config:8-15");
DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18");
DEFINE_UNCORE_FORMAT_ATTR(inv, inv, "config:23");
DEFINE_UNCORE_FORMAT_ATTR(cmask5, cmask, "config:24-28");
DEFINE_UNCORE_FORMAT_ATTR(cmask8, cmask, "config:24-31");
+DEFINE_UNCORE_FORMAT_ATTR(threshold, threshold, "config:24-29");

/* Sandy Bridge uncore support */
static void snb_uncore_msr_enable_event(struct intel_uncore_box *box, struct perf_event *event)
@@ -422,6 +445,106 @@ void tgl_uncore_cpu_init(void)
skl_uncore_msr_ops.init_box = rkl_uncore_msr_init_box;
}

+static void adl_uncore_msr_init_box(struct intel_uncore_box *box)
+{
+ if (box->pmu->pmu_idx == 0)
+ wrmsrl(ADL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
+}
+
+static void adl_uncore_msr_enable_box(struct intel_uncore_box *box)
+{
+ wrmsrl(ADL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
+}
+
+static void adl_uncore_msr_disable_box(struct intel_uncore_box *box)
+{
+ if (box->pmu->pmu_idx == 0)
+ wrmsrl(ADL_UNC_PERF_GLOBAL_CTL, 0);
+}
+
+static void adl_uncore_msr_exit_box(struct intel_uncore_box *box)
+{
+ if (box->pmu->pmu_idx == 0)
+ wrmsrl(ADL_UNC_PERF_GLOBAL_CTL, 0);
+}
+
+static struct intel_uncore_ops adl_uncore_msr_ops = {
+ .init_box = adl_uncore_msr_init_box,
+ .enable_box = adl_uncore_msr_enable_box,
+ .disable_box = adl_uncore_msr_disable_box,
+ .exit_box = adl_uncore_msr_exit_box,
+ .disable_event = snb_uncore_msr_disable_event,
+ .enable_event = snb_uncore_msr_enable_event,
+ .read_counter = uncore_msr_read_counter,
+};
+
+static struct attribute *adl_uncore_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_umask.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_threshold.attr,
+ NULL,
+};
+
+static const struct attribute_group adl_uncore_format_group = {
+ .name = "format",
+ .attrs = adl_uncore_formats_attr,
+};
+
+static struct intel_uncore_type adl_uncore_cbox = {
+ .name = "cbox",
+ .num_counters = 2,
+ .perf_ctr_bits = 44,
+ .perf_ctr = ADL_UNC_CBO_0_PER_CTR0,
+ .event_ctl = ADL_UNC_CBO_0_PERFEVTSEL0,
+ .event_mask = ADL_UNC_RAW_EVENT_MASK,
+ .msr_offset = ICL_UNC_CBO_MSR_OFFSET,
+ .ops = &adl_uncore_msr_ops,
+ .format_group = &adl_uncore_format_group,
+};
+
+static struct intel_uncore_type adl_uncore_arb = {
+ .name = "arb",
+ .num_counters = 2,
+ .num_boxes = 2,
+ .perf_ctr_bits = 44,
+ .perf_ctr = ADL_UNC_ARB_PER_CTR0,
+ .event_ctl = ADL_UNC_ARB_PERFEVTSEL0,
+ .event_mask = SNB_UNC_RAW_EVENT_MASK,
+ .msr_offset = ADL_UNC_ARB_MSR_OFFSET,
+ .constraints = snb_uncore_arb_constraints,
+ .ops = &adl_uncore_msr_ops,
+ .format_group = &snb_uncore_format_group,
+};
+
+static struct intel_uncore_type adl_uncore_clockbox = {
+ .name = "clock",
+ .num_counters = 1,
+ .num_boxes = 1,
+ .fixed_ctr_bits = 48,
+ .fixed_ctr = ADL_UNC_FIXED_CTR,
+ .fixed_ctl = ADL_UNC_FIXED_CTR_CTRL,
+ .single_fixed = 1,
+ .event_mask = SNB_UNC_CTL_EV_SEL_MASK,
+ .format_group = &icl_uncore_clock_format_group,
+ .ops = &adl_uncore_msr_ops,
+ .event_descs = icl_uncore_events,
+};
+
+static struct intel_uncore_type *adl_msr_uncores[] = {
+ &adl_uncore_cbox,
+ &adl_uncore_arb,
+ &adl_uncore_clockbox,
+ NULL,
+};
+
+void adl_uncore_cpu_init(void)
+{
+ adl_uncore_cbox.num_boxes = icl_get_cbox_num();
+ uncore_msr_uncores = adl_msr_uncores;
+}
+
enum {
SNB_PCI_UNCORE_IMC,
};
@@ -1203,6 +1326,14 @@ static const struct pci_device_id tgl_uncore_pci_ids[] = {
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_TGL_H_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ADL_1_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ADL_2_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
{ /* end: all zeroes */ }
};

--
2.7.4

2021-02-08 18:19:00

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 43/49] perf stat: Add default hybrid events

From: Jin Yao <[email protected]>

Previously, if '-e' is not specified in perf stat, some software events
and hardware events are added to the evlist by default.

root@otcpl-adl-s-2:~# ./perf stat -- ./triad_loop

Performance counter stats for './triad_loop':

109.43 msec task-clock # 0.993 CPUs utilized
1 context-switches # 0.009 K/sec
0 cpu-migrations # 0.000 K/sec
105 page-faults # 0.960 K/sec
401,161,982 cycles # 3.666 GHz
1,601,216,357 instructions # 3.99 insn per cycle
200,217,751 branches # 1829.686 M/sec
14,555 branch-misses # 0.01% of all branches

0.110176860 seconds time elapsed

Among the events, cycles, instructions, branches and branch-misses
are hardware events.

On a hybrid platform, two events are created for each hardware event:

core cycles,
atom cycles,
core instructions,
atom instructions,
core branches,
atom branches,
core branch-misses,
atom branch-misses

These events will be added to the evlist in this order on a hybrid
platform if '-e' is not set.

Since parse_events() now supports creating two hardware events for one
event on a hybrid platform, we just use parse_events(evlist,
"cycles,instructions,branches,branch-misses") to create the default
events and add them to the evlist.

After:
root@otcpl-adl-s-2:~# ./perf stat -vv -- taskset -c 16 ./triad_loop
...
------------------------------------------------------------
perf_event_attr:
type 1
size 120
config 0x1
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
type 1
size 120
config 0x3
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 4
------------------------------------------------------------
perf_event_attr:
type 1
size 120
config 0x4
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 5
------------------------------------------------------------
perf_event_attr:
type 1
size 120
config 0x2
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 7
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0x400000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 8
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 9
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0x400000001
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 10
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000001
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 11
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0x400000004
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 12
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000004
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 13
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0x400000005
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 14
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000005
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
...

Performance counter stats for 'taskset -c 16 ./triad_loop':

201.31 msec task-clock # 0.997 CPUs utilized
1 context-switches # 0.005 K/sec
1 cpu-migrations # 0.005 K/sec
166 page-faults # 0.825 K/sec
623,267,134 cycles # 3096.043 M/sec (0.16%)
603,082,383 cycles # 2995.777 M/sec (99.84%)
406,410,481 instructions # 2018.820 M/sec (0.16%)
1,604,213,375 instructions # 7968.837 M/sec (99.84%)
81,444,171 branches # 404.569 M/sec (0.16%)
200,616,430 branches # 996.550 M/sec (99.84%)
3,769,856 branch-misses # 18.727 M/sec (0.16%)
16,111 branch-misses # 0.080 M/sec (99.84%)

0.201895853 seconds time elapsed

We can see that two events are created for each hardware event. The first
one is the core event, the second one is the atom event.

Note that the shadow stats look a bit different; now they are just
'M/sec'.

perf_stat__update_shadow_stats() and perf_stat__print_shadow_stats()
need to be improved in the future if we want to get the original shadow
stats back.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/builtin-stat.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 44d1a5f..0b08665 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1137,6 +1137,13 @@ static int parse_hybrid_type(const struct option *opt,
return 0;
}

+static int add_default_hybrid_events(struct evlist *evlist)
+{
+ struct parse_events_error err;
+
+ return parse_events(evlist, "cycles,instructions,branches,branch-misses", &err);
+}
+
static struct option stat_options[] = {
OPT_BOOLEAN('T', "transaction", &transaction_run,
"hardware transaction statistics"),
@@ -1613,6 +1620,12 @@ static int add_default_attributes(void)
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },

};
+ struct perf_event_attr default_sw_attrs[] = {
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
+};

/*
* Detailed stats (-d), covering the L1 and last level data caches:
@@ -1849,6 +1862,15 @@ static int add_default_attributes(void)
}

if (!evsel_list->core.nr_entries) {
+ perf_pmu__scan(NULL);
+ if (perf_pmu__hybrid_exist()) {
+ if (evlist__add_default_attrs(evsel_list,
+ default_sw_attrs) < 0) {
+ return -1;
+ }
+ return add_default_hybrid_events(evsel_list);
+ }
+
if (target__has_cpu(&target))
default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;

--
2.7.4

2021-02-08 18:19:58

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 42/49] perf evlist: Create two hybrid 'cycles' events by default

From: Jin Yao <[email protected]>

When the evlist is empty, for example when no '-e' is specified in
perf record, one default 'cycles' event is added to the evlist.

On a hybrid platform, however, two default 'cycles' events need to be
created: one for the core PMU and one for the atom PMU.

This patch calls evsel__new_cycles() twice to create the two 'cycles'
events.

root@otcpl-adl-s-2:~# ./perf record -vv -- sleep 1
...
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0x400000000
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|PERIOD
read_format ID
disabled 1
inherit 1
mmap 1
comm 1
freq 1
enable_on_exec 1
task 1
precise_ip 3
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
------------------------------------------------------------
sys_perf_event_open: pid 11609 cpu 0 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid 11609 cpu 1 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid 11609 cpu 2 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid 11609 cpu 3 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid 11609 cpu 4 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid 11609 cpu 5 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid 11609 cpu 6 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid 11609 cpu 7 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid 11609 cpu 8 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid 11609 cpu 9 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid 11609 cpu 10 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid 11609 cpu 11 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid 11609 cpu 12 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid 11609 cpu 13 group_fd -1 flags 0x8 = 19
sys_perf_event_open: pid 11609 cpu 14 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid 11609 cpu 15 group_fd -1 flags 0x8 = 21
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000000
{ sample_period, sample_freq } 4000
sample_type IP|TID|TIME|ID|PERIOD
read_format ID
disabled 1
inherit 1
freq 1
enable_on_exec 1
precise_ip 3
sample_id_all 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid 11609 cpu 16 group_fd -1 flags 0x8 = 22
sys_perf_event_open: pid 11609 cpu 17 group_fd -1 flags 0x8 = 23
sys_perf_event_open: pid 11609 cpu 18 group_fd -1 flags 0x8 = 24
sys_perf_event_open: pid 11609 cpu 19 group_fd -1 flags 0x8 = 25
sys_perf_event_open: pid 11609 cpu 20 group_fd -1 flags 0x8 = 26
sys_perf_event_open: pid 11609 cpu 21 group_fd -1 flags 0x8 = 27
sys_perf_event_open: pid 11609 cpu 22 group_fd -1 flags 0x8 = 28
sys_perf_event_open: pid 11609 cpu 23 group_fd -1 flags 0x8 = 29
...
------------------------------------------------------------
perf_event_attr:
type 1
size 120
config 0x9
watermark 1
sample_id_all 1
bpf_event 1
{ wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 30
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 31
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 32
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 33
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 34
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 35
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 36
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 37
sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 38
sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 39
sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 40
sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 41
sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 42
sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 43
sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 44
sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 45
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 46
sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 47
sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 48
sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 49
sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 50
sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 51
sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 52
sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 53
...

We can see that the core 'cycles' event (0x400000000) is enabled on
cpu0-cpu15 and the atom 'cycles' event (0xa00000000) is enabled on
cpu16-cpu23.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/evlist.c | 33 ++++++++++++++++++++++++++++++++-
tools/perf/util/evsel.c | 6 +++---
tools/perf/util/evsel.h | 2 +-
3 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 626a0e7..8606e82 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -243,10 +243,41 @@ void evlist__set_leader(struct evlist *evlist)
}
}

+static int __evlist__add_hybrid_default(struct evlist *evlist, bool precise)
+{
+ struct evsel *evsel;
+ struct perf_pmu *pmu;
+ __u64 config;
+ struct perf_cpu_map *cpus;
+
+ perf_pmu__for_each_hybrid_pmus(pmu) {
+ config = PERF_COUNT_HW_CPU_CYCLES |
+ ((__u64)pmu->type << PERF_PMU_TYPE_SHIFT);
+ evsel = evsel__new_cycles(precise, PERF_TYPE_HARDWARE_PMU,
+ config);
+ if (!evsel)
+ return -ENOMEM;
+
+ cpus = perf_cpu_map__get(pmu->cpus);
+ evsel->core.cpus = cpus;
+ evsel->core.own_cpus = perf_cpu_map__get(cpus);
+ evsel->pmu_name = strdup(pmu->name);
+ evlist__add(evlist, evsel);
+ }
+
+ return 0;
+}
+
int __evlist__add_default(struct evlist *evlist, bool precise)
{
- struct evsel *evsel = evsel__new_cycles(precise);
+ struct evsel *evsel;
+
+ perf_pmu__scan(NULL);
+ if (perf_pmu__hybrid_exist())
+ return __evlist__add_hybrid_default(evlist, precise);

+ evsel = evsel__new_cycles(precise, PERF_TYPE_HARDWARE,
+ PERF_COUNT_HW_CPU_CYCLES);
if (evsel == NULL)
return -ENOMEM;

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 24c0b59..61508cf 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -292,11 +292,11 @@ static bool perf_event_can_profile_kernel(void)
return perf_event_paranoid_check(1);
}

-struct evsel *evsel__new_cycles(bool precise)
+struct evsel *evsel__new_cycles(bool precise, __u32 type, __u64 config)
{
struct perf_event_attr attr = {
- .type = PERF_TYPE_HARDWARE,
- .config = PERF_COUNT_HW_CPU_CYCLES,
+ .type = type,
+ .config = config,
.exclude_kernel = !perf_event_can_profile_kernel(),
};
struct evsel *evsel;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 4eb054a..aa73442 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -198,7 +198,7 @@ static inline struct evsel *evsel__newtp(const char *sys, const char *name)
return evsel__newtp_idx(sys, name, 0);
}

-struct evsel *evsel__new_cycles(bool precise);
+struct evsel *evsel__new_cycles(bool precise, __u32 type, __u64 config);

struct tep_event *event_format__new(const char *sys, const char *name);

--
2.7.4

2021-02-08 18:21:00

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 25/49] perf/x86/rapl: Add support for Intel Alder Lake

From: Zhang Rui <[email protected]>

Alder Lake RAPL support is the same as on the previous Sky Lake.
Add the Alder Lake models for RAPL.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
---
arch/x86/events/rapl.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c
index 7dbbeaa..b493459 100644
--- a/arch/x86/events/rapl.c
+++ b/arch/x86/events/rapl.c
@@ -800,6 +800,8 @@ static const struct x86_cpu_id rapl_model_match[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &model_hsx),
X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE_L, &model_skl),
X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE, &model_skl),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &model_skl),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &model_skl),
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &model_spr),
X86_MATCH_VENDOR_FAM(AMD, 0x17, &model_amd_fam17h),
X86_MATCH_VENDOR_FAM(HYGON, 0x18, &model_amd_fam17h),
--
2.7.4

2021-02-08 18:21:42

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 46/49] perf stat: Filter out unmatched aggregation for hybrid event

From: Jin Yao <[email protected]>

perf-stat supports several aggregation modes, such as --per-core,
--per-socket, etc. A hybrid event, however, may be available on only
part of the cpus. So for --per-core we need to filter out the
unavailable cores, for --per-socket filter out the unavailable
sockets, and so on.

Before:

root@otcpl-adl-s-2:~# ./perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1

Performance counter stats for 'system wide':

S0-D0-C0 2 311,114 cycles [cpu_core]
S0-D0-C4 2 59,784 cycles [cpu_core]
S0-D0-C8 2 121,287 cycles [cpu_core]
S0-D0-C12 2 2,690,245 cycles [cpu_core]
S0-D0-C16 2 2,060,545 cycles [cpu_core]
S0-D0-C20 2 3,632,251 cycles [cpu_core]
S0-D0-C24 2 775,736 cycles [cpu_core]
S0-D0-C28 2 742,020 cycles [cpu_core]
S0-D0-C32 0 <not counted> cycles [cpu_core]
S0-D0-C33 0 <not counted> cycles [cpu_core]
S0-D0-C34 0 <not counted> cycles [cpu_core]
S0-D0-C35 0 <not counted> cycles [cpu_core]
S0-D0-C36 0 <not counted> cycles [cpu_core]
S0-D0-C37 0 <not counted> cycles [cpu_core]
S0-D0-C38 0 <not counted> cycles [cpu_core]
S0-D0-C39 0 <not counted> cycles [cpu_core]

1.001779842 seconds time elapsed

After:

root@otcpl-adl-s-2:~# ./perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1

Performance counter stats for 'system wide':

S0-D0-C0 2 1,088,230 cycles [cpu_core]
S0-D0-C4 2 57,228 cycles [cpu_core]
S0-D0-C8 2 98,327 cycles [cpu_core]
S0-D0-C12 2 2,741,955 cycles [cpu_core]
S0-D0-C16 2 2,090,432 cycles [cpu_core]
S0-D0-C20 2 3,192,108 cycles [cpu_core]
S0-D0-C24 2 2,910,752 cycles [cpu_core]
S0-D0-C28 2 388,696 cycles [cpu_core]

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/stat-display.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 21a3f80..fa11572 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -630,6 +630,20 @@ static void aggr_cb(struct perf_stat_config *config,
}
}

+static bool aggr_id_hybrid_matched(struct perf_stat_config *config,
+ struct evsel *counter, struct aggr_cpu_id id)
+{
+ struct aggr_cpu_id s;
+
+ for (int i = 0; i < evsel__nr_cpus(counter); i++) {
+ s = config->aggr_get_id(config, evsel__cpus(counter), i);
+ if (cpu_map__compare_aggr_cpu_id(s, id))
+ return true;
+ }
+
+ return false;
+}
+
static void print_counter_aggrdata(struct perf_stat_config *config,
struct evsel *counter, int s,
char *prefix, bool metric_only,
@@ -643,6 +657,12 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
double uval;

ad.id = id = config->aggr_map->map[s];
+
+ if (perf_pmu__hybrid_exist() &&
+ !aggr_id_hybrid_matched(config, counter, id)) {
+ return;
+ }
+
ad.val = ad.ena = ad.run = 0;
ad.nr = 0;
if (!collect_data(config, counter, aggr_cb, &ad))
--
2.7.4

2021-02-08 18:22:05

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 45/49] perf stat: Merge event counts from all hybrid PMUs

From: Jin Yao <[email protected]>

For hybrid events, by default stat aggregates and reports the event counts
per pmu.

root@otcpl-adl-s-2:~# ./perf stat -e cycles -a -- sleep 1

Performance counter stats for 'system wide':

17,291,386 cycles [cpu_core]
1,556,803 cycles [cpu_atom]

1.002154118 seconds time elapsed

Sometimes it's also useful to aggregate the event counts from all PMUs.
Create a new option '--hybrid-merge' to enable that behavior and report
the counts without the per-PMU split.

root@otcpl-adl-s-2:~# ./perf stat -e cycles -a --hybrid-merge -- sleep 1

Performance counter stats for 'system wide':

19,041,587 cycles

1.002195329 seconds time elapsed

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 7 +++++++
tools/perf/builtin-stat.c | 3 ++-
tools/perf/util/stat-display.c | 3 ++-
tools/perf/util/stat.h | 1 +
4 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index b0e357d..3d083a3 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -418,6 +418,13 @@ Multiple events are created from a single event specification when:
2. Aliases, which are listed immediately after the Kernel PMU events
by perf list, are used.

+--hybrid-merge::
+Merge the hybrid event counts from all PMUs.
+
+For hybrid events, by default stat aggregates and reports the event counts
+per pmu. But sometimes it's also useful to aggregate event counts from all
+PMUs. This option enables that behavior and reports the counts without PMUs.
+
--smi-cost::
Measure SMI cost if msr/aperf/ and msr/smi/ events are supported.

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index bfe7305..d367cfe 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1184,6 +1184,7 @@ static struct option stat_options[] = {
OPT_SET_UINT('A', "no-aggr", &stat_config.aggr_mode,
"disable CPU count aggregation", AGGR_NONE),
OPT_BOOLEAN(0, "no-merge", &stat_config.no_merge, "Do not merge identical named events"),
+ OPT_BOOLEAN(0, "hybrid-merge", &stat_config.hybrid_merge, "Merge identical named hybrid events"),
OPT_STRING('x', "field-separator", &stat_config.csv_sep, "separator",
"print counts with custom separator"),
OPT_CALLBACK('G', "cgroup", &evsel_list, "name",
@@ -2379,7 +2380,7 @@ int cmd_stat(int argc, const char **argv)

evlist__check_cpu_maps(evsel_list);

- if (perf_pmu__hybrid_exist())
+ if (perf_pmu__hybrid_exist() && !stat_config.hybrid_merge)
stat_config.no_merge = true;

/*
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 961d5ac..21a3f80 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -568,6 +568,7 @@ static void collect_all_aliases(struct perf_stat_config *config, struct evsel *c
!strcmp(alias->pmu_name, counter->pmu_name) ||
(evsel__is_hybrid_event(alias) &&
evsel__is_hybrid_event(counter) &&
+ !config->hybrid_merge &&
strcmp(alias->pmu_name, counter->pmu_name)))
break;
alias->merged_stat = true;
@@ -585,7 +586,7 @@ static bool collect_data(struct perf_stat_config *config, struct evsel *counter,
cb(config, counter, data, true);
if (config->no_merge)
uniquify_event_name(counter);
- else if (counter->auto_merge_stats)
+ else if (counter->auto_merge_stats || config->hybrid_merge)
collect_all_aliases(config, counter, cb, data);
return true;
}
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index d85c292..80f6715 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -123,6 +123,7 @@ struct perf_stat_config {
bool ru_display;
bool big_num;
bool no_merge;
+ bool hybrid_merge;
bool walltime_run_table;
bool all_kernel;
bool all_user;
--
2.7.4

2021-02-08 18:22:11

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 31/49] perf stat: Hybrid evsel uses its own cpus

From: Jin Yao <[email protected]>

On a hybrid platform, atom events can only be enabled on atom CPUs and
core events can only be enabled on core CPUs. So a hybrid event can
only be enabled on its own CPUs.

The problem with current perf is that the cpus for an evsel (read via
PMU sysfs) have been merged into evsel_list->core.all_cpus, which may
contain all CPUs.

So we need a way to let a hybrid event use only its own CPUs.

The idea is to create a new evlist__invalidate_all_cpus() to invalidate
evsel_list->core.all_cpus so that evlist__for_each_cpu returns cpu -1
for a hybrid evsel. If cpu is -1, the hybrid evsel will use its own cpus.

We will see the following code piece in the patch:

if (cpu == -1 && !evlist->thread_mode)
evsel__enable_cpus(pos);

It lets the event be enabled only on the event's own cpus.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/builtin-stat.c | 37 ++++++++++++++++++++++--
tools/perf/util/evlist.c | 72 ++++++++++++++++++++++++++++++++++++++++++++---
tools/perf/util/evlist.h | 4 +++
tools/perf/util/evsel.h | 8 ++++++
4 files changed, 115 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index bc84b31..afb8789 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -392,6 +392,18 @@ static int read_counter_cpu(struct evsel *counter, struct timespec *rs, int cpu)
return 0;
}

+static int read_counter_cpus(struct evsel *counter, struct timespec *rs)
+{
+ int cpu, nr_cpus, err = 0;
+ struct perf_cpu_map *cpus = evsel__cpus(counter);
+
+ nr_cpus = cpus ? cpus->nr : 1;
+ for (cpu = 0; cpu < nr_cpus; cpu++)
+ err = read_counter_cpu(counter, rs, cpu);
+
+ return err;
+}
+
static int read_affinity_counters(struct timespec *rs)
{
struct evsel *counter;
@@ -413,8 +425,14 @@ static int read_affinity_counters(struct timespec *rs)
if (evsel__cpu_iter_skip(counter, cpu))
continue;
if (!counter->err) {
- counter->err = read_counter_cpu(counter, rs,
- counter->cpu_iter - 1);
+ if (cpu == -1 && !evsel_list->thread_mode) {
+ counter->err = read_counter_cpus(counter, rs);
+ } else if (evsel_list->thread_mode) {
+ counter->err = read_counter_cpu(counter, rs, 0);
+ } else {
+ counter->err = read_counter_cpu(counter, rs,
+ counter->cpu_iter - 1);
+ }
}
}
}
@@ -747,6 +765,21 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
if (group)
evlist__set_leader(evsel_list);

+ /*
+ * On hybrid platform, the cpus for evsel (via PMU sysfs) have been
+ * merged to evsel_list->core.all_cpus. We use evlist__invalidate_all_cpus
+ * to invalidate the evsel_list->core.all_cpus then evlist__for_each_cpu
+ * returns cpu -1 for hybrid evsel. If cpu is -1, hybrid evsel will
+ * use its own cpus.
+ */
+ if (evlist__has_hybrid_events(evsel_list)) {
+ evlist__invalidate_all_cpus(evsel_list);
+ if (!target__has_cpu(&target) ||
+ target__has_per_thread(&target)) {
+ evsel_list->thread_mode = true;
+ }
+ }
+
if (affinity__setup(&affinity) < 0)
return -1;

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 05363a7..626a0e7 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -375,7 +375,8 @@ bool evsel__cpu_iter_skip_no_inc(struct evsel *ev, int cpu)
bool evsel__cpu_iter_skip(struct evsel *ev, int cpu)
{
if (!evsel__cpu_iter_skip_no_inc(ev, cpu)) {
- ev->cpu_iter++;
+ if (cpu != -1)
+ ev->cpu_iter++;
return false;
}
return true;
@@ -404,6 +405,16 @@ static int evlist__is_enabled(struct evlist *evlist)
return false;
}

+static void evsel__disable_cpus(struct evsel *evsel)
+{
+ int cpu, nr_cpus;
+ struct perf_cpu_map *cpus = evsel__cpus(evsel);
+
+ nr_cpus = cpus ? cpus->nr : 1;
+ for (cpu = 0; cpu < nr_cpus; cpu++)
+ evsel__disable_cpu(evsel, cpu);
+}
+
static void __evlist__disable(struct evlist *evlist, char *evsel_name)
{
struct evsel *pos;
@@ -430,7 +441,12 @@ static void __evlist__disable(struct evlist *evlist, char *evsel_name)
has_imm = true;
if (pos->immediate != imm)
continue;
- evsel__disable_cpu(pos, pos->cpu_iter - 1);
+ if (cpu == -1 && !evlist->thread_mode)
+ evsel__disable_cpus(pos);
+ else if (evlist->thread_mode)
+ evsel__disable_cpu(pos, 0);
+ else
+ evsel__disable_cpu(pos, pos->cpu_iter - 1);
}
}
if (!has_imm)
@@ -466,6 +482,15 @@ void evlist__disable_evsel(struct evlist *evlist, char *evsel_name)
__evlist__disable(evlist, evsel_name);
}

+static void evsel__enable_cpus(struct evsel *evsel)
+{
+ int cpu, nr_cpus;
+ struct perf_cpu_map *cpus = evsel__cpus(evsel);
+
+ nr_cpus = cpus ? cpus->nr : 1;
+ for (cpu = 0; cpu < nr_cpus; cpu++)
+ evsel__enable_cpu(evsel, cpu);
+}
static void __evlist__enable(struct evlist *evlist, char *evsel_name)
{
struct evsel *pos;
@@ -485,7 +510,12 @@ static void __evlist__enable(struct evlist *evlist, char *evsel_name)
continue;
if (!evsel__is_group_leader(pos) || !pos->core.fd)
continue;
- evsel__enable_cpu(pos, pos->cpu_iter - 1);
+ if (cpu == -1 && !evlist->thread_mode)
+ evsel__enable_cpus(pos);
+ else if (evlist->thread_mode)
+ evsel__enable_cpu(pos, 0);
+ else
+ evsel__enable_cpu(pos, pos->cpu_iter - 1);
}
}
affinity__cleanup(&affinity);
@@ -1260,6 +1290,16 @@ void evlist__set_selected(struct evlist *evlist, struct evsel *evsel)
evlist->selected = evsel;
}

+static void evsel__close_cpus(struct evsel *evsel)
+{
+ int cpu, nr_cpus;
+ struct perf_cpu_map *cpus = evsel__cpus(evsel);
+
+ nr_cpus = cpus ? cpus->nr : 1;
+ for (cpu = 0; cpu < nr_cpus; cpu++)
+ perf_evsel__close_cpu(&evsel->core, cpu);
+}
+
void evlist__close(struct evlist *evlist)
{
struct evsel *evsel;
@@ -1284,7 +1324,13 @@ void evlist__close(struct evlist *evlist)
evlist__for_each_entry_reverse(evlist, evsel) {
if (evsel__cpu_iter_skip(evsel, cpu))
continue;
- perf_evsel__close_cpu(&evsel->core, evsel->cpu_iter - 1);
+
+ if (cpu == -1 && !evlist->thread_mode)
+ evsel__close_cpus(evsel);
+ else if (evlist->thread_mode)
+ perf_evsel__close_cpu(&evsel->core, 0);
+ else
+ perf_evsel__close_cpu(&evsel->core, evsel->cpu_iter - 1);
}
}
affinity__cleanup(&affinity);
@@ -2010,3 +2056,21 @@ struct evsel *evlist__find_evsel(struct evlist *evlist, int idx)
}
return NULL;
}
+
+bool evlist__has_hybrid_events(struct evlist *evlist)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(evlist, evsel) {
+ if (evsel__is_hybrid_event(evsel))
+ return true;
+ }
+
+ return false;
+}
+
+void evlist__invalidate_all_cpus(struct evlist *evlist)
+{
+ perf_cpu_map__put(evlist->core.all_cpus);
+ evlist->core.all_cpus = perf_cpu_map__empty_new(1);
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 1aae758..9741df4 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -52,6 +52,7 @@ struct evlist {
struct perf_evlist core;
int nr_groups;
bool enabled;
+ bool thread_mode;
int id_pos;
int is_pos;
u64 combined_sample_type;
@@ -353,4 +354,7 @@ int evlist__ctlfd_ack(struct evlist *evlist);
#define EVLIST_DISABLED_MSG "Events disabled\n"

struct evsel *evlist__find_evsel(struct evlist *evlist, int idx);
+void evlist__invalidate_all_cpus(struct evlist *evlist);
+
+bool evlist__has_hybrid_events(struct evlist *evlist);
#endif /* __PERF_EVLIST_H */
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 5c161a2..4eb054a 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -7,9 +7,11 @@
#include <sys/types.h>
#include <linux/perf_event.h>
#include <linux/types.h>
+#include <string.h>
#include <internal/evsel.h>
#include <perf/evsel.h>
#include "symbol_conf.h"
+#include "pmu.h"
#include <internal/cpumap.h>

struct bpf_object;
@@ -427,4 +429,10 @@ static inline bool evsel__is_dummy_event(struct evsel *evsel)
struct perf_env *evsel__env(struct evsel *evsel);

int evsel__store_ids(struct evsel *evsel, struct evlist *evlist);
+
+static inline bool evsel__is_hybrid_event(struct evsel *evsel)
+{
+ return evsel->pmu_name && perf_pmu__is_hybrid(evsel->pmu_name);
+}
+
#endif /* __PERF_EVSEL_H */
--
2.7.4

2021-02-08 18:22:48

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 27/49] perf util: Save pmu name to struct perf_pmu_alias

From: Jin Yao <[email protected]>

On a hybrid platform, an event is available on only one pmu
(such as cpu_core or cpu_atom).

This patch saves the pmu name in the new pmu field of struct perf_pmu_alias,
so that we can later know on which pmu the event can be enabled.
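
For illustration, a later consumer could use the saved name roughly like
this inside the alias walk (a hypothetical snippet, not part of this patch):

	/* when expanding aliases for a given pmu, skip aliases that
	 * belong to another hybrid pmu
	 */
	if (alias->pmu && strcmp(alias->pmu, pmu->name))
		continue;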

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/pmu.c | 17 +++++++++++++----
tools/perf/util/pmu.h | 1 +
2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 44ef283..0c25457 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -283,6 +283,7 @@ void perf_pmu_free_alias(struct perf_pmu_alias *newalias)
zfree(&newalias->str);
zfree(&newalias->metric_expr);
zfree(&newalias->metric_name);
+ zfree(&newalias->pmu);
parse_events_terms__purge(&newalias->terms);
free(newalias);
}
@@ -297,6 +298,10 @@ static bool perf_pmu_merge_alias(struct perf_pmu_alias *newalias,

list_for_each_entry(a, alist, list) {
if (!strcasecmp(newalias->name, a->name)) {
+ if (newalias->pmu && a->pmu &&
+ !strcasecmp(newalias->pmu, a->pmu)) {
+ continue;
+ }
perf_pmu_update_alias(a, newalias);
perf_pmu_free_alias(newalias);
return true;
@@ -311,7 +316,8 @@ static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
char *unit, char *perpkg,
char *metric_expr,
char *metric_name,
- char *deprecated)
+ char *deprecated,
+ char *pmu)
{
struct parse_events_term *term;
struct perf_pmu_alias *alias;
@@ -382,6 +388,7 @@ static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
}
alias->per_pkg = perpkg && sscanf(perpkg, "%d", &num) == 1 && num == 1;
alias->str = strdup(newval);
+ alias->pmu = pmu ? strdup(pmu) : NULL;

if (deprecated)
alias->deprecated = true;
@@ -407,7 +414,7 @@ static int perf_pmu__new_alias(struct list_head *list, char *dir, char *name, FI
strim(buf);

return __perf_pmu__new_alias(list, dir, name, NULL, buf, NULL, NULL, NULL,
- NULL, NULL, NULL, NULL);
+ NULL, NULL, NULL, NULL, NULL);
}

static inline bool pmu_alias_info_file(char *name)
@@ -797,7 +804,8 @@ void pmu_add_cpu_aliases_map(struct list_head *head, struct perf_pmu *pmu,
(char *)pe->unit, (char *)pe->perpkg,
(char *)pe->metric_expr,
(char *)pe->metric_name,
- (char *)pe->deprecated);
+ (char *)pe->deprecated,
+ (char *)pe->pmu);
}
}

@@ -870,7 +878,8 @@ static int pmu_add_sys_aliases_iter_fn(struct pmu_event *pe, void *data)
(char *)pe->perpkg,
(char *)pe->metric_expr,
(char *)pe->metric_name,
- (char *)pe->deprecated);
+ (char *)pe->deprecated,
+ NULL);
}

return 0;
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 8164388..0e724d5 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -72,6 +72,7 @@ struct perf_pmu_alias {
bool deprecated;
char *metric_expr;
char *metric_name;
+ char *pmu;
};

struct perf_pmu *perf_pmu__find(const char *name);
--
2.7.4

2021-02-08 18:22:57

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 29/49] perf pmu: Add hybrid helper functions

From: Jin Yao <[email protected]>

The functions perf_pmu__is_hybrid and perf_pmu__find_hybrid_pmu
can be used to identify a hybrid platform and to return the found
hybrid cpu pmu. All the detected hybrid pmus are saved in the
'perf_pmu__hybrid_pmus' list, so we just need to search this list
for a pmu name.

perf_pmu__hybrid_type_to_pmu converts the user-specified string
to a hybrid pmu name. This is used to support the '--cputype' option
in later patches.

perf_pmu__hybrid_exist checks if hybrid pmus exist.
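
A minimal usage sketch of these helpers (illustrative only, not part of
this patch; it assumes the pmus have already been scanned):

	struct perf_pmu *pmu;

	if (!perf_pmu__hybrid_exist())
		return;		/* nothing special to do on non-hybrid systems */

	perf_pmu__for_each_hybrid_pmus(pmu)	/* e.g. cpu_core, cpu_atom */
		pr_debug("hybrid pmu: %s, type: %u\n", pmu->name, pmu->type);

	if (perf_pmu__is_hybrid("cpu_atom"))
		pr_debug("cpu_atom is a hybrid pmu\n");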

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/pmu.c | 40 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/pmu.h | 11 +++++++++++
2 files changed, 51 insertions(+)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index e97b121..04447f5 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1842,3 +1842,43 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu)

return nr_caps;
}
+
+struct perf_pmu *perf_pmu__find_hybrid_pmu(const char *name)
+{
+ struct perf_pmu *pmu;
+
+ if (!name)
+ return NULL;
+
+ perf_pmu__for_each_hybrid_pmus(pmu) {
+ if (!strcmp(name, pmu->name))
+ return pmu;
+ }
+
+ return NULL;
+}
+
+bool perf_pmu__is_hybrid(const char *name)
+{
+ return perf_pmu__find_hybrid_pmu(name) != NULL;
+}
+
+char *perf_pmu__hybrid_type_to_pmu(const char *type)
+{
+ char *pmu_name = NULL;
+
+ if (asprintf(&pmu_name, "cpu_%s", type) < 0)
+ return NULL;
+
+ if (perf_pmu__is_hybrid(pmu_name))
+ return pmu_name;
+
+ /*
+ * The pmus may be not scanned yet, so check the sysfs.
+ */
+ if (pmu_is_hybrid(pmu_name))
+ return pmu_name;
+
+ free(pmu_name);
+ return NULL;
+}
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 99bdb5d..bb74595 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -131,4 +131,15 @@ int perf_pmu__caps_parse(struct perf_pmu *pmu);
#define perf_pmu__for_each_hybrid_pmus(pmu) \
list_for_each_entry(pmu, &perf_pmu__hybrid_pmus, hybrid_list)

+struct perf_pmu *perf_pmu__find_hybrid_pmu(const char *name);
+
+bool perf_pmu__is_hybrid(const char *name);
+
+char *perf_pmu__hybrid_type_to_pmu(const char *type);
+
+static inline bool perf_pmu__hybrid_exist(void)
+{
+ return !list_empty(&perf_pmu__hybrid_pmus);
+}
+
#endif /* __PMU_H */
--
2.7.4

2021-02-08 18:23:27

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 26/49] perf jevents: Support unit value "cpu_core" and "cpu_atom"

From: Jin Yao <[email protected]>

Some Intel platforms, such as Alder Lake, are hybrid platforms that
consist of atom cpus and core cpus. Each cpu type has a dedicated event
list. Some events are available on the core cpus, some events are
available on the atom cpus.

The kernel exports two new cpu pmus: cpu_core and cpu_atom. Each event
in the json files gets a new field "Unit" to indicate which pmu the
event is available on.

For example, one event in cache.json,

{
"BriefDescription": "Counts the number of load ops retired that",
"CollectPEBSRecord": "2",
"Counter": "0,1,2,3",
"EventCode": "0xd2",
"EventName": "MEM_LOAD_UOPS_RETIRED_MISC.MMIO",
"PEBScounters": "0,1,2,3",
"SampleAfterValue": "1000003",
"UMask": "0x80",
"Unit": "cpu_atom"
},

The unit "cpu_atom" indicates this event is only available on "cpu_atom".

In generated pmu-events.c, we can see:

{
.name = "mem_load_uops_retired_misc.mmio",
.event = "period=1000003,umask=0x80,event=0xd2",
.desc = "Counts the number of load ops retired that. Unit: cpu_atom ",
.topic = "cache",
.pmu = "cpu_atom",
},

But without this patch, the "uncore_" prefix would be added by the internal
JSON file parser:
.pmu = "uncore_cpu_atom"

That would be the wrong pmu.
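
For illustration, the mapping table is consulted roughly like this when
emitting the .pmu field (the helper name and field names below are
assumptions for the sketch, not necessarily the real jevents code):

	static const char *unit_to_pmu_name(const char *unit)
	{
		struct map *m;

		for (m = unit_to_pmu; m->json; m++) {
			if (!strcasecmp(m->json, unit))
				return m->perf;	/* "cpu_atom" stays "cpu_atom" */
		}
		return NULL;	/* unknown units keep the default handling */
	}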

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/pmu-events/jevents.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index e1f3f5c..b1a15f5 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -285,6 +285,8 @@ static struct map {
{ "imx8_ddr", "imx8_ddr" },
{ "L3PMC", "amd_l3" },
{ "DFPMC", "amd_df" },
+ { "cpu_core", "cpu_core" },
+ { "cpu_atom", "cpu_atom" },
{}
};

--
2.7.4

2021-02-08 18:23:35

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 47/49] perf evlist: Warn as events from different hybrid PMUs in a group

From: Jin Yao <[email protected]>

If a group has events which are from different hybrid PMUs,
show a warning.

This is to remind the user not to put core events and atom
events into one group.

root@otcpl-adl-s-2:~# ./perf stat -e "{cpu_core/cycles/,cpu_atom/cycles/}" -- sleep 1
WARNING: Group has events from different hybrid PMUs

Performance counter stats for 'sleep 1':

<not counted> cycles [cpu_core]
<not supported> cycles [cpu_atom]

1.001591674 seconds time elapsed

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/builtin-record.c | 3 +++
tools/perf/builtin-stat.c | 7 +++++++
tools/perf/util/evlist.c | 43 +++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/evlist.h | 2 ++
4 files changed, 55 insertions(+)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index fd39116..cfc1b90 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -926,6 +926,9 @@ static int record__open(struct record *rec)
pos = evlist__reset_weak_group(evlist, pos, true);
goto try_again;
}
+
+ if (errno == EINVAL && perf_pmu__hybrid_exist())
+ evlist__warn_hybrid_group(evlist);
rc = -errno;
evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
ui__error("%s\n", msg);
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index d367cfe..87a5f44 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -238,6 +238,9 @@ static void evlist__check_cpu_maps(struct evlist *evlist)
struct evsel *evsel, *pos, *leader;
char buf[1024];

+ if (evlist__hybrid_exist(evlist))
+ return;
+
evlist__for_each_entry(evlist, evsel) {
leader = evsel->leader;

@@ -692,6 +695,10 @@ enum counter_recovery {
static enum counter_recovery stat_handle_error(struct evsel *counter)
{
char msg[BUFSIZ];
+
+ if (perf_pmu__hybrid_exist() && errno == EINVAL)
+ evlist__warn_hybrid_group(evsel_list);
+
/*
* PPC returns ENXIO for HW counters until 2.6.37
* (behavior changed with commit b0a873e).
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 8606e82..3bdff5c 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -2105,3 +2105,46 @@ void evlist__invalidate_all_cpus(struct evlist *evlist)
perf_cpu_map__put(evlist->core.all_cpus);
evlist->core.all_cpus = perf_cpu_map__empty_new(1);
}
+
+static bool group_hybrid_conflict(struct evsel *leader)
+{
+ struct evsel *pos, *prev = NULL;
+
+ for_each_group_evsel(pos, leader) {
+ if (!pos->pmu_name || !perf_pmu__is_hybrid(pos->pmu_name))
+ continue;
+
+ if (prev && strcmp(prev->pmu_name, pos->pmu_name))
+ return true;
+
+ prev = pos;
+ }
+
+ return false;
+}
+
+void evlist__warn_hybrid_group(struct evlist *evlist)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(evlist, evsel) {
+ if (evsel__is_group_event(evsel) &&
+ group_hybrid_conflict(evsel)) {
+ WARN_ONCE(1, "WARNING: Group has events from "
+ "different hybrid PMUs\n");
+ return;
+ }
+ }
+}
+
+bool evlist__hybrid_exist(struct evlist *evlist)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(evlist, evsel) {
+ if (evsel__is_hybrid_event(evsel))
+ return true;
+ }
+
+ return false;
+}
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index c06b9ff..55c944b 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -358,4 +358,6 @@ struct evsel *evlist__find_evsel(struct evlist *evlist, int idx);
void evlist__invalidate_all_cpus(struct evlist *evlist);

bool evlist__has_hybrid_events(struct evlist *evlist);
+void evlist__warn_hybrid_group(struct evlist *evlist);
+bool evlist__hybrid_exist(struct evlist *evlist);
#endif /* __PERF_EVLIST_H */
--
2.7.4

2021-02-08 18:24:22

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 48/49] perf Documentation: Document intel-hybrid support

From: Jin Yao <[email protected]>

Add some words and examples to help with the understanding of
Intel hybrid perf support.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/Documentation/intel-hybrid.txt | 335 ++++++++++++++++++++++++++++++
tools/perf/Documentation/perf-record.txt | 1 +
tools/perf/Documentation/perf-stat.txt | 2 +
3 files changed, 338 insertions(+)
create mode 100644 tools/perf/Documentation/intel-hybrid.txt

diff --git a/tools/perf/Documentation/intel-hybrid.txt b/tools/perf/Documentation/intel-hybrid.txt
new file mode 100644
index 0000000..bdf38d0
--- /dev/null
+++ b/tools/perf/Documentation/intel-hybrid.txt
@@ -0,0 +1,335 @@
+Intel hybrid support
+--------------------
+Support for Intel hybrid events within perf tools.
+
+Some Intel platforms, such as AlderLake, are hybrid platforms that
+consist of atom cpus and core cpus. Each cpu has a dedicated event list.
+Some events are available on the core cpus, some events are available
+on the atom cpus, and some events are even available on both.
+
+Kernel exports two new cpu pmus via sysfs:
+/sys/devices/cpu_core
+/sys/devices/cpu_atom
+
+The 'cpus' files are created under the directories. For example,
+
+cat /sys/devices/cpu_core/cpus
+0-15
+
+cat /sys/devices/cpu_atom/cpus
+16-23
+
+It indicates cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus.
+
+Quickstart
+
+List hybrid event
+-----------------
+
+As before, use perf-list to list the symbolic event.
+
+perf list
+
+inst_retired.any
+ [Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
+inst_retired.any
+ [Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core]
+
+The 'Unit: xxx' is added to the brief description to indicate which pmu
+the event belongs to. The same event name can be supported on
+different pmus.
+
+Use '--cputype' option to list core only event or atom only event.
+
+perf list --cputype atom
+
+inst_retired.any
+ [Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom]
+
+Enable hybrid event with a specific pmu
+---------------------------------------
+
+To enable a core-only event or atom-only event, the following syntax is supported:
+
+ cpu_core/<event name>/
+or
+ cpu_atom/<event name>/
+or
+ --cputype core -e <event name>
+or
+ --cputype atom -e <event name>
+
+For example, count the 'cycles' event on core cpus.
+
+ perf stat -e cpu_core/cycles/
+or
+ perf stat --cputype core -e cycles
+
+If '--cputype' value conflicts with pmu prefix, '--cputype' is ignored and
+a warning will be displayed.
+
+Create two events for one hardware event automatically
+------------------------------------------------------
+
+When creating one event and the event is available on both atom and core,
+two events are created automatically. One is for atom, the other is for
+core. Most hardware events and cache events are available on both
+cpu_core and cpu_atom.
+
+Hardware events have pre-defined configs (e.g. 0 for cycles).
+But on a hybrid platform, the kernel needs to know where the event comes
+from (from atom or from core). The original perf event type PERF_TYPE_HARDWARE
+can't carry pmu information. So a new type PERF_TYPE_HARDWARE_PMU is
+introduced.
+
+The new attr.config layout for PERF_TYPE_HARDWARE_PMU:
+
+0xDD000000AA
+AA: original hardware event ID
+DD: PMU type ID
+
+Cache events are similar. A new type PERF_TYPE_HW_CACHE_PMU is introduced.
+
+The new attr.config layout for PERF_TYPE_HW_CACHE_PMU:
+
+0xDD00CCBBAA
+AA: original hardware cache ID
+BB: original hardware cache op ID
+CC: original hardware cache op result ID
+DD: PMU type ID
+
+PMU type ID is retrieved from sysfs
+
+cat /sys/devices/cpu_atom/type
+10
+
+cat /sys/devices/cpu_core/type
+4
+
+When enabling a hardware event without a specified pmu, such as
+perf stat -e cycles -a (system-wide in this example), two events
+are created automatically.
+
+ ------------------------------------------------------------
+ perf_event_attr:
+ type 6
+ size 120
+ config 0x400000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ exclude_guest 1
+ ------------------------------------------------------------
+
+and
+
+ ------------------------------------------------------------
+ perf_event_attr:
+ type 6
+ size 120
+ config 0xa00000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ exclude_guest 1
+ ------------------------------------------------------------
+
+type 6 is PERF_TYPE_HARDWARE_PMU.
+0x4 in 0x400000000 indicates it's cpu_core pmu.
+0xa in 0xa00000000 indicates it's cpu_atom pmu (atom pmu type id is random).
+
+The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus),
+and create 'cycles' (0xa00000000) on cpu16-cpu23 (atom cpus).
+
+For perf-stat result, it displays two events:
+
+ Performance counter stats for 'system wide':
+
+ 14,240,632 cycles
+ 1,556,789 cycles
+
+The first 'cycles' is core event, the second 'cycles' is atom event.
+
+In the event list, for events with the same name, the first one is the
+core event, and the second one is the atom event.
+
+Thread mode example:
+--------------------
+
+perf-stat reports the scaled counts for hybrid events, with a percentage
+displayed. The percentage is the event's running time / enabling time.
+
+For example, 'triad_loop' runs on cpu16 (an atom cpu); we can see the
+scaled value for core cycles is 160,444,092 and the percentage is 0.47%.
+
+perf stat -e cycles -- taskset -c 16 ./triad_loop
+
+As previous, two events are created.
+
+------------------------------------------------------------
+perf_event_attr:
+ type 6
+ size 120
+ config 0x400000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ enable_on_exec 1
+ exclude_guest 1
+------------------------------------------------------------
+
+and
+
+------------------------------------------------------------
+perf_event_attr:
+ type 6
+ size 120
+ config 0xa00000000
+ sample_type IDENTIFIER
+ read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
+ disabled 1
+ inherit 1
+ enable_on_exec 1
+ exclude_guest 1
+------------------------------------------------------------
+
+cycles: 0: 751035 206505423 966647
+cycles: 0: 601381075 206505423 205538776
+cycles: 160444092 206505423 966647
+cycles: 604209364 206505423 205538776
+
+ Performance counter stats for 'taskset -c 16 ./triad_loop':
+
+ 160,444,092 cycles (0.47%)
+ 604,209,364 cycles (99.53%)
+
+ 0.207494637 seconds time elapsed
+
+We can use '--no-scale' to see the original data.
+
+perf stat -e cycles --no-scale -- taskset -c 16 ./triad_loop
+
+cycles: 0: 755604 206642115 956961
+cycles: 0: 601433846 206642115 205685154
+cycles: 755604 206642115 956961
+cycles: 601433846 206642115 205685154
+
+ Performance counter stats for 'taskset -c 16 ./triad_loop':
+
+ 755,604 cycles (0.46%)
+ 601,433,846 cycles (99.54%)
+
+Support metrics with hybrid events:
+-----------------------------------
+
+On a hybrid platform, the same metric is probably made of events
+from different pmus.
+
+For example "CPI".
+
+Atom:
+CPI = cpu_clk_unhalted.core / inst_retired.any_p
+
+Core:
+CPI = cpu_clk_unhalted.thread / inst_retired.any
+
+inst_retired.any_p and inst_retired.any are available on both core
+and atom CPU. But cpu_clk_unhalted.core is only available on atom and
+cpu_clk_unhalted.thread is only available on core CPU.
+
+The metric group string is expanded to:
+"{inst_retired.any,cpu_clk_unhalted.thread}:W,{cpu_clk_unhalted.core,inst_retired.any_p}:W".
+
+"{inst_retired.any,cpu_clk_unhalted.thread}:W" applies CPI on core.
+"{cpu_clk_unhalted.core,inst_retired.any_p}:W" applies CPI on atom.
+
+The difficulty is that "inst_retired.any_p" and "inst_retired.any"
+can be available on both. So pmu suffixes "cpu_atom"/"cpu_core" are added
+to limit the event group to a specified PMU.
+
+"{inst_retired.any,cpu_clk_unhalted.thread}:W#cpu_core,{cpu_clk_unhalted.core,inst_retired.any_p}:W#cpu_atom"
+
+That means the group "{inst_retired.any,cpu_clk_unhalted.thread}:W" is only
+enabled on core and "{cpu_clk_unhalted.core,inst_retired.any_p}:W" is only
+enabled on atom.
+
+For perf-stat result, it reports two 'CPI'.
+
+perf stat -M CPI -a -- sleep 1
+
+ Performance counter stats for 'system wide':
+
+ 3,654,776 inst_retired.any # 4.76 CPI
+ 17,409,944 cpu_clk_unhalted.thread
+ 771,568 cpu_clk_unhalted.core # 6.83 CPI
+ 112,943 inst_retired.any_p
+
+The first CPI is for core, the second CPI is for atom.
+Of course, we can get the core CPI only by using the '--cputype' option.
+
+perf stat --cputype core -M CPI --no-merge -a -- sleep 1
+
+ Performance counter stats for 'system wide':
+
+ 2,496,815 inst_retired.any [cpu_core] # 3.66 CPI
+ 9,149,133 cpu_clk_unhalted.thread [cpu_core]
+
+perf-record:
+------------
+
+If there is no '-e' specified in perf record, on a hybrid platform,
+it creates two default 'cycles' events and adds them to the event list.
+One is for core, the other is for atom.
+
+The '--cputype' option is supported for perf-record. It enables the
+events only on the cpus of the specified pmu type.
+
+perf record --cputype core
+
+Only the core 'cycles' are enabled on core cpus.
+
+perf-stat:
+----------
+
+If there is no '-e' specified in perf stat, on a hybrid platform,
+besides the software events, the following events are created and
+added to the event list in order.
+
+core 'cycles',
+atom 'cycles',
+core 'instructions',
+atom 'instructions',
+core 'branches',
+atom 'branches',
+core 'branch-misses',
+atom 'branch-misses'
+
+The '--cputype' option is supported for perf-stat. It enables the
+events only on the cpus of the specified pmu type.
+
+perf stat --cputype core
+
+Only the core events are enabled on core cpus.
+
+core 'cycles',
+core 'instructions',
+core 'branches',
+core 'branch-misses',
+
+Of course, both perf-stat and perf-record support enabling a
+hybrid event with a specific pmu.
+
+e.g.
+perf stat -e cpu_core/cycles/
+perf stat -e cpu_atom/cycles/
+perf stat -e cpu_core/r1a/
+perf stat -e cpu_atom/L1-icache-loads/
+perf stat -e cpu_core/cycles/,cpu_atom/instructions/
+perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}'
+
+But '{cpu_core/cycles/,cpu_atom/instructions/}' will return
+"<not supported>" for 'instructions', because the pmus in
+group are not matched (cpu_core vs. cpu_atom).
\ No newline at end of file
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 34cf651..6fc01f0 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -680,6 +680,7 @@ measurements:
wait -n ${perf_pid}
exit $?

+include::intel-hybrid.txt[]

SEE ALSO
--------
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 3d083a3..43bdeaa 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -520,6 +520,8 @@ The fields are in this order:

Additional metrics may be printed with all earlier fields being empty.

+include::intel-hybrid.txt[]
+
SEE ALSO
--------
linkperf:perf-top[1], linkperf:perf-list[1]
--
2.7.4

2021-02-08 18:25:19

by Liang, Kan

[permalink] [raw]
Subject: [PATCH 49/49] perf evsel: Adjust hybrid event and global event mixed group

From: Jin Yao <[email protected]>

A group mixing a hybrid event and a global event is allowed. For example,
the group leader is 'cpu-clock' and the group member is 'cpu_atom/cycles/'.

e.g.
perf stat -e '{cpu-clock,cpu_atom/cycles/}' -a

The challenge is that their available cpus are not fully matched.
For example, 'cpu-clock' is available on CPU0-CPU23, but 'cpu_atom/cycles/'
is only available on CPU16-CPU23.

When getting the group fd for a group member, we must be very careful,
because the cpu for 'cpu-clock' is not equal to the cpu for 'cpu_atom/cycles/'.
Actually the cpu here is the index into evsel->core.cpus, not the real CPU ID,
e.g. cpu0 for 'cpu-clock' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.
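
Roughly, translating one evsel's cpu index to another evsel's index for
the same CPU looks like this (the patch implements it below as
evsel_cpuid_match()):

	/* real CPU ID behind index 'cpu' of evsel1, e.g. 16 */
	int cpuid = perf_cpu_map__cpu(evsel1->core.cpus, cpu);
	/* index of that CPU in evsel2's cpu map, or -1 if not present */
	int idx = perf_cpu_map__idx(evsel2->core.cpus, cpuid);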

Another challenge is the group read. The events in a group may not be
available on all cpus. For example, the leader is a software event and
it's available on CPU0-CPU1, but the group member is a hybrid event and
it's only available on CPU1. For CPU0 we have only one event, but for CPU1
we have two events. So we need to change the read size according to
the real number of events on that cpu.

Let's see examples,

root@otcpl-adl-s-2:~# ./perf stat -e '{cpu-clock,cpu_atom/cycles/}' -a -vvv -- sleep 1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 1
size 120
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 21
sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 22
sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 23
sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 24
sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 25
sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 26
sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 16 group_fd 20 flags 0x8 = 28
sys_perf_event_open: pid -1 cpu 17 group_fd 21 flags 0x8 = 29
sys_perf_event_open: pid -1 cpu 18 group_fd 22 flags 0x8 = 30
sys_perf_event_open: pid -1 cpu 19 group_fd 23 flags 0x8 = 31
sys_perf_event_open: pid -1 cpu 20 group_fd 24 flags 0x8 = 32
sys_perf_event_open: pid -1 cpu 21 group_fd 25 flags 0x8 = 33
sys_perf_event_open: pid -1 cpu 22 group_fd 26 flags 0x8 = 34
sys_perf_event_open: pid -1 cpu 23 group_fd 27 flags 0x8 = 35
cpu-clock: 0: 1001661765 1001663044 1001663044
cpu-clock: 1: 1001659407 1001659885 1001659885
cpu-clock: 2: 1001646087 1001647302 1001647302
cpu-clock: 3: 1001645168 1001645550 1001645550
cpu-clock: 4: 1001645052 1001646102 1001646102
cpu-clock: 5: 1001643719 1001644472 1001644472
cpu-clock: 6: 1001641893 1001642859 1001642859
cpu-clock: 7: 1001640524 1001641036 1001641036
cpu-clock: 8: 1001637596 1001638076 1001638076
cpu-clock: 9: 1001638121 1001638200 1001638200
cpu-clock: 10: 1001635825 1001636915 1001636915
cpu-clock: 11: 1001633722 1001634276 1001634276
cpu-clock: 12: 1001687133 1001686941 1001686941
cpu-clock: 13: 1001693663 1001693317 1001693317
cpu-clock: 14: 1001693381 1001694407 1001694407
cpu-clock: 15: 1001691865 1001692321 1001692321
cpu-clock: 16: 1001696621 1001696550 1001696550
cpu-clock: 17: 1001699963 1001699822 1001699822
cpu-clock: 18: 1001701938 1001701850 1001701850
cpu-clock: 19: 1001699298 1001699214 1001699214
cpu-clock: 20: 1001691550 1001691026 1001691026
cpu-clock: 21: 1001688348 1001688212 1001688212
cpu-clock: 22: 1001684907 1001684799 1001684799
cpu-clock: 23: 1001680840 1001680780 1001680780
cycles: 0: 28175 1001696550 1001696550
cycles: 1: 403323 1001699822 1001699822
cycles: 2: 35905 1001701850 1001701850
cycles: 3: 36755 1001699214 1001699214
cycles: 4: 33757 1001691026 1001691026
cycles: 5: 37146 1001688212 1001688212
cycles: 6: 35483 1001684799 1001684799
cycles: 7: 38600 1001680780 1001680780
cpu-clock: 24040038386 24040046956 24040046956
cycles: 649144 8013542253 8013542253

Performance counter stats for 'system wide':

24,040.04 msec cpu-clock # 23.976 CPUs utilized
649,144 cycles [cpu_atom] # 0.027 M/sec

1.002683706 seconds time elapsed

For cpu_atom/cycles/, cpu16-cpu23 are set with a valid group fd (cpu-clock's fd
on that cpu). For the counting results, cpu-clock aggregates over 24 cpus and
cpu_atom/cycles/ aggregates over 8 cpus. That's expected.

But if the event order is changed, e.g. '{cpu_atom/cycles/,cpu-clock}',
there is more work to do.

root@otcpl-adl-s-2:~# ./perf stat -e '{cpu_atom/cycles/,cpu-clock}' -a -vvv -- sleep 1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 3
sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 11
------------------------------------------------------------
perf_event_attr:
type 1
size 120
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 19
sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 21
sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 22
sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 23
sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 24
sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 25
sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 26
sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 27
sys_perf_event_open: pid -1 cpu 16 group_fd 3 flags 0x8 = 28
sys_perf_event_open: pid -1 cpu 17 group_fd 4 flags 0x8 = 29
sys_perf_event_open: pid -1 cpu 18 group_fd 5 flags 0x8 = 30
sys_perf_event_open: pid -1 cpu 19 group_fd 7 flags 0x8 = 31
sys_perf_event_open: pid -1 cpu 20 group_fd 8 flags 0x8 = 32
sys_perf_event_open: pid -1 cpu 21 group_fd 9 flags 0x8 = 33
sys_perf_event_open: pid -1 cpu 22 group_fd 10 flags 0x8 = 34
sys_perf_event_open: pid -1 cpu 23 group_fd 11 flags 0x8 = 35
cycles: 0: 422260 1001993637 1001993637
cycles: 1: 631309 1002039934 1002039934
cycles: 2: 309501 1002018065 1002018065
cycles: 3: 119279 1002040811 1002040811
cycles: 4: 89389 1002039312 1002039312
cycles: 5: 155437 1002054794 1002054794
cycles: 6: 92420 1002051141 1002051141
cycles: 7: 96017 1002073659 1002073659
cpu-clock: 0: 0 0 0
cpu-clock: 1: 0 0 0
cpu-clock: 2: 0 0 0
cpu-clock: 3: 0 0 0
cpu-clock: 4: 0 0 0
cpu-clock: 5: 0 0 0
cpu-clock: 6: 0 0 0
cpu-clock: 7: 0 0 0
cpu-clock: 8: 0 0 0
cpu-clock: 9: 0 0 0
cpu-clock: 10: 0 0 0
cpu-clock: 11: 0 0 0
cpu-clock: 12: 0 0 0
cpu-clock: 13: 0 0 0
cpu-clock: 14: 0 0 0
cpu-clock: 15: 0 0 0
cpu-clock: 16: 1001997706 1001993637 1001993637
cpu-clock: 17: 1002040524 1002039934 1002039934
cpu-clock: 18: 1002018570 1002018065 1002018065
cpu-clock: 19: 1002041360 1002040811 1002040811
cpu-clock: 20: 1002044731 1002039312 1002039312
cpu-clock: 21: 1002055355 1002054794 1002054794
cpu-clock: 22: 1002051659 1002051141 1002051141
cpu-clock: 23: 1002074150 1002073659 1002073659
cycles: 1915612 8016311353 8016311353
cpu-clock: 8016324055 8016311353 8016311353

Performance counter stats for 'system wide':

1,915,612 cycles [cpu_atom] # 0.239 M/sec
8,016.32 msec cpu-clock # 7.996 CPUs utilized

1.002545027 seconds time elapsed

For cpu-clock, cpu16-cpu23 are set with a valid group fd (cpu_atom/cycles/'s
fd on that cpu). For the counting results, cpu_atom/cycles/ aggregates over
8 cpus, which is correct. But cpu-clock also aggregates over only 8 cpus
(cpu16-cpu23, not all cpus); the code should be improved. For now a warning
is displayed: "WARNING: for cpu-clock, some CPU counts not read".

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Jin Yao <[email protected]>
---
tools/perf/util/evsel.c | 105 +++++++++++++++++++++++++++++++++++++++++++++---
tools/perf/util/stat.h | 1 +
2 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 61508cf..65c8cfc8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1453,15 +1453,26 @@ static void evsel__set_count(struct evsel *counter, int cpu, int thread, u64 val
perf_counts__set_loaded(counter->counts, cpu, thread, true);
}

-static int evsel__process_group_data(struct evsel *leader, int cpu, int thread, u64 *data)
+static int evsel_cpuid_match(struct evsel *evsel1, struct evsel *evsel2,
+ int cpu)
+{
+ int cpuid;
+
+ cpuid = perf_cpu_map__cpu(evsel1->core.cpus, cpu);
+ return perf_cpu_map__idx(evsel2->core.cpus, cpuid);
+}
+
+static int evsel__process_group_data(struct evsel *leader, int cpu, int thread,
+ u64 *data, int nr_members)
{
u64 read_format = leader->core.attr.read_format;
struct sample_read_value *v;
u64 nr, ena = 0, run = 0, i;
+ int idx;

nr = *data++;

- if (nr != (u64) leader->core.nr_members)
+ if (nr != (u64) nr_members)
return -EINVAL;

if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
@@ -1481,24 +1492,85 @@ static int evsel__process_group_data(struct evsel *leader, int cpu, int thread,
if (!counter)
return -EINVAL;

- evsel__set_count(counter, cpu, thread, v[i].value, ena, run);
+ if (evsel__is_hybrid_event(counter) ||
+ evsel__is_hybrid_event(leader)) {
+ idx = evsel_cpuid_match(leader, counter, cpu);
+ if (idx == -1)
+ return -EINVAL;
+ } else
+ idx = cpu;
+
+ evsel__set_count(counter, idx, thread, v[i].value, ena, run);
}

return 0;
}

+static int hybrid_read_size(struct evsel *leader, int cpu, int *nr_members)
+{
+ struct evsel *pos;
+ int nr = 1, back, new_size = 0, idx;
+
+ for_each_group_member(pos, leader) {
+ idx = evsel_cpuid_match(leader, pos, cpu);
+ if (idx != -1)
+ nr++;
+ }
+
+ if (nr != leader->core.nr_members) {
+ back = leader->core.nr_members;
+ leader->core.nr_members = nr;
+ new_size = perf_evsel__read_size(&leader->core);
+ leader->core.nr_members = back;
+ }
+
+ *nr_members = nr;
+ return new_size;
+}
+
static int evsel__read_group(struct evsel *leader, int cpu, int thread)
{
struct perf_stat_evsel *ps = leader->stats;
u64 read_format = leader->core.attr.read_format;
int size = perf_evsel__read_size(&leader->core);
+ int new_size, nr_members;
u64 *data = ps->group_data;

if (!(read_format & PERF_FORMAT_ID))
return -EINVAL;

- if (!evsel__is_group_leader(leader))
+ if (!evsel__is_group_leader(leader)) {
+ if (evsel__is_hybrid_event(leader->leader) &&
+ !evsel__is_hybrid_event(leader)) {
+ /*
+ * The group leader is a hybrid event and it's
+ * only available on part of the cpus. But the group
+ * members are available on all cpus. TODO:
+ * read the counts on the rest of cpus for group
+ * member.
+ */
+ WARN_ONCE(1, "WARNING: for %s, some CPU counts "
+ "not read\n", leader->name);
+ return 0;
+ }
return -EINVAL;
+ }
+
+ /*
+ * For example the leader is a software event and it's available on
+ * cpu0-cpu1, but the group member is a hybrid event and it's only
+ * available on cpu1. For cpu0, we have only one event, but for cpu1
+ * we have two events. So we need to change the read size according to
+ * the real number of events on a given cpu.
+ */
+ new_size = hybrid_read_size(leader, cpu, &nr_members);
+ if (new_size)
+ size = new_size;
+
+ if (ps->group_data && ps->group_data_size < size) {
+ zfree(&ps->group_data);
+ data = NULL;
+ }

if (!data) {
data = zalloc(size);
@@ -1506,6 +1578,7 @@ static int evsel__read_group(struct evsel *leader, int cpu, int thread)
return -ENOMEM;

ps->group_data = data;
+ ps->group_data_size = size;
}

if (FD(leader, cpu, thread) < 0)
@@ -1514,7 +1587,7 @@ static int evsel__read_group(struct evsel *leader, int cpu, int thread)
if (readn(FD(leader, cpu, thread), data, size) <= 0)
return -errno;

- return evsel__process_group_data(leader, cpu, thread, data);
+ return evsel__process_group_data(leader, cpu, thread, data, nr_members);
}

int evsel__read_counter(struct evsel *evsel, int cpu, int thread)
@@ -1561,6 +1634,28 @@ static int get_group_fd(struct evsel *evsel, int cpu, int thread)
*/
BUG_ON(!leader->core.fd);

+ /*
+ * If leader is not hybrid event, it's available on
+ * all cpus (e.g. software event). But hybrid evsel
+ * member is only available on part of cpus. So need
+ * to get the leader's fd from correct cpu.
+ */
+ if (evsel__is_hybrid_event(evsel) &&
+ !evsel__is_hybrid_event(leader)) {
+ cpu = evsel_cpuid_match(evsel, leader, cpu);
+ BUG_ON(cpu == -1);
+ }
+
+ /*
+ * Leader is hybrid event but member is global event.
+ */
+ if (!evsel__is_hybrid_event(evsel) &&
+ evsel__is_hybrid_event(leader)) {
+ cpu = evsel_cpuid_match(evsel, leader, cpu);
+ if (cpu == -1)
+ return -1;
+ }
+
fd = FD(leader, cpu, thread);
BUG_ON(fd == -1);

diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index 80f6715..b96168c 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -46,6 +46,7 @@ struct perf_stat_evsel {
struct stats res_stats[3];
enum perf_stat_evsel_id id;
u64 *group_data;
+ int group_data_size;
};

enum aggr_mode {
--
2.7.4

2021-02-08 19:16:41

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 02/49] x86/cpu: Describe hybrid CPUs in cpuinfo_x86

On Mon, Feb 08, 2021 at 07:24:59AM -0800, [email protected] wrote:
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index c20a52b..1f25ac9 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -139,6 +139,16 @@ struct cpuinfo_x86 {
> u32 microcode;
> /* Address space bits used by the cache internally */
> u8 x86_cache_bits;
> + /*
> + * In hybrid processors, there is a CPU type and a native model ID. The
> + * CPU type (x86_cpu_type[31:24]) describes the type of micro-
> + * architecture families. The native model ID (x86_cpu_type[23:0])
> + * describes a specific microarchitecture version. Combining both
> + * allows to uniquely identify a CPU.
> + *
> + * Please note that the native model ID is not related to x86_model.
> + */
> + u32 x86_cpu_type;

Why are you adding it here instead of simply using
X86_FEATURE_HYBRID_CPU at the call site?

How many uses in this patchset?

/me searches...

Exactly one.

Just query X86_FEATURE_HYBRID_CPU at the call site and read what you
need from CPUID and use it there - no need for this.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-02-08 20:21:30

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 28/49] perf pmu: Save detected hybrid pmus to a global pmu list

Em Mon, Feb 08, 2021 at 07:25:25AM -0800, [email protected] escreveu:
> From: Jin Yao <[email protected]>
>
> We identify the cpu_core pmu and cpu_atom pmu by explicitly
> checking following files:
>
> For cpu_core, check:
> "/sys/bus/event_source/devices/cpu_core/cpus"
>
> For cpu_atom, check:
> "/sys/bus/event_source/devices/cpu_atom/cpus"
>
> If the 'cpus' file exists, the pmu exists.
>
> But in order not to hardcode the "cpu_core" and "cpu_atom",
> and make the code generic, if the path
> "/sys/bus/event_source/devices/cpu_xxx/cpus" exists, the hybrid
> pmu exists. All the detected hybrid pmus are linked to a
> global list 'perf_pmu__hybrid_pmus' and then next we just need
> to iterate the list by using perf_pmu__for_each_hybrid_pmus.
>
> Reviewed-by: Andi Kleen <[email protected]>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> tools/perf/util/pmu.c | 21 +++++++++++++++++++++
> tools/perf/util/pmu.h | 7 +++++++
> 2 files changed, 28 insertions(+)
>
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 0c25457..e97b121 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -27,6 +27,7 @@
> #include "fncache.h"
>
> struct perf_pmu perf_pmu__fake;
> +LIST_HEAD(perf_pmu__hybrid_pmus);
>
> struct perf_pmu_format {
> char *name;
> @@ -633,11 +634,27 @@ static struct perf_cpu_map *pmu_cpumask(const char *name)
> return NULL;
> }
>
> +static bool pmu_is_hybrid(const char *name)
> +{
> + char path[PATH_MAX];
> + const char *sysfs;
> +
> + if (strncmp(name, "cpu_", 4))
> + return false;
> +
> + sysfs = sysfs__mountpoint();

It's extremely unlikely that sysfs isn't mounted, but if so, this will
NULL deref, so please do as other sysfs__mountpoint() uses in
tools/perf/util/pmu.c and check if sysfs is NULL, returning false, i.e.
file isn't available.

> + snprintf(path, PATH_MAX, CPUS_TEMPLATE_CPU, sysfs, name);
> + return file_available(path);
> +}
> +
> static bool pmu_is_uncore(const char *name)
> {
> char path[PATH_MAX];
> const char *sysfs;
>
> + if (pmu_is_hybrid(name))
> + return false;
> +
> sysfs = sysfs__mountpoint();
> snprintf(path, PATH_MAX, CPUS_TEMPLATE_UNCORE, sysfs, name);
> return file_available(path);
> @@ -951,6 +968,7 @@ static struct perf_pmu *pmu_lookup(const char *name)
> pmu->is_uncore = pmu_is_uncore(name);
> if (pmu->is_uncore)
> pmu->id = pmu_id(name);
> + pmu->is_hybrid = pmu_is_hybrid(name);
> pmu->max_precise = pmu_max_precise(name);
> pmu_add_cpu_aliases(&aliases, pmu);
> pmu_add_sys_aliases(&aliases, pmu);
> @@ -962,6 +980,9 @@ static struct perf_pmu *pmu_lookup(const char *name)
> list_splice(&aliases, &pmu->aliases);
> list_add_tail(&pmu->list, &pmus);
>
> + if (pmu->is_hybrid)
> + list_add_tail(&pmu->hybrid_list, &perf_pmu__hybrid_pmus);
> +
> pmu->default_config = perf_pmu__get_default_config(pmu);
>
> return pmu;
> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
> index 0e724d5..99bdb5d 100644
> --- a/tools/perf/util/pmu.h
> +++ b/tools/perf/util/pmu.h
> @@ -5,6 +5,7 @@
> #include <linux/bitmap.h>
> #include <linux/compiler.h>
> #include <linux/perf_event.h>
> +#include <linux/list.h>
> #include <stdbool.h>
> #include "parse-events.h"
> #include "pmu-events/pmu-events.h"
> @@ -34,6 +35,7 @@ struct perf_pmu {
> __u32 type;
> bool selectable;
> bool is_uncore;
> + bool is_hybrid;
> bool auxtrace;
> int max_precise;
> struct perf_event_attr *default_config;
> @@ -42,9 +44,11 @@ struct perf_pmu {
> struct list_head aliases; /* HEAD struct perf_pmu_alias -> list */
> struct list_head caps; /* HEAD struct perf_pmu_caps -> list */
> struct list_head list; /* ELEM */
> + struct list_head hybrid_list;
> };
>
> extern struct perf_pmu perf_pmu__fake;
> +extern struct list_head perf_pmu__hybrid_pmus;
>
> struct perf_pmu_info {
> const char *unit;
> @@ -124,4 +128,7 @@ int perf_pmu__convert_scale(const char *scale, char **end, double *sval);
>
> int perf_pmu__caps_parse(struct perf_pmu *pmu);
>
> +#define perf_pmu__for_each_hybrid_pmus(pmu) \

singular, i.e.

#define perf_pmu__for_each_hybrid_pmu(pmu) \

> + list_for_each_entry(pmu, &perf_pmu__hybrid_pmus, hybrid_list)
> +
> #endif /* __PMU_H */
> --
> 2.7.4
>

--

- Arnaldo

2021-02-08 20:25:29

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 35/49] perf parse-events: Create two hybrid hardware events

Em Mon, Feb 08, 2021 at 07:25:32AM -0800, [email protected] escreveu:
> From: Jin Yao <[email protected]>
>
> Hardware events have pre-defined configs. The kernel needs to know
> which pmu the event comes from (e.g. the cpu_core pmu or the cpu_atom
> pmu), but the perf type 'PERF_TYPE_HARDWARE' can't carry pmu
> information.
>
> So the kernel introduces a new type 'PERF_TYPE_HARDWARE_PMU'.
> The new attr.config layout for PERF_TYPE_HARDWARE_PMU is:
>
> 0xDD000000AA
> AA: original hardware event ID
> DD: PMU type ID
>
> PMU type ID is retrieved from sysfs. For example,
>
> cat /sys/devices/cpu_atom/type
> 10
>
> cat /sys/devices/cpu_core/type
> 4
>
> When enabling a hybrid hardware event without a specified pmu, such as
> 'perf stat -e cycles -a', two events are created automatically: one
> for atom, the other for core.

please move the command output two chars to the right, otherwise lines
with --- may confuse some scripts.

> root@otcpl-adl-s-2:~# ./perf stat -e cycles -vv -a -- sleep 1
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0x400000000
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
> sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
> sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
> sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 8
> sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 9
> sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 10
> sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 11
> sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 12
> sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 13
> sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 14
> sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 15
> sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 16
> sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 17
> sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 18
> sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0xa00000000
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
> sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 21
> sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 22
> sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 23
> sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 24
> sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 25
> sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 26
> sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
> cycles: 0: 1254337 1001292571 1001292571
> cycles: 1: 2595141 1001279813 1001279813
> cycles: 2: 134853 1001276406 1001276406
> cycles: 3: 81119 1001271089 1001271089
> cycles: 4: 251353 1001264678 1001264678
> cycles: 5: 415593 1001259163 1001259163
> cycles: 6: 129643 1001265312 1001265312
> cycles: 7: 80289 1001258979 1001258979
> cycles: 8: 169983 1001251207 1001251207
> cycles: 9: 81981 1001245487 1001245487
> cycles: 10: 4116221 1001245537 1001245537
> cycles: 11: 85531 1001253097 1001253097
> cycles: 12: 3969132 1001254270 1001254270
> cycles: 13: 96006 1001254691 1001254691
> cycles: 14: 385004 1001244971 1001244971
> cycles: 15: 394446 1001251437 1001251437
> cycles: 0: 427330 1001253457 1001253457
> cycles: 1: 444043 1001255914 1001255914
> cycles: 2: 97285 1001253555 1001253555
> cycles: 3: 92071 1001260556 1001260556
> cycles: 4: 86292 1001249896 1001249896
> cycles: 5: 236851 1001238979 1001238979
> cycles: 6: 100081 1001239792 1001239792
> cycles: 7: 72836 1001243276 1001243276
> cycles: 14240632 16020168708 16020168708
> cycles: 1556789 8009995425 8009995425
>
> Performance counter stats for 'system wide':
>
> 14,240,632 cycles
> 1,556,789 cycles
>
> 1.002261231 seconds time elapsed
>
> type 6 is PERF_TYPE_HARDWARE_PMU.
> 0x4 in 0x400000000 indicates the cpu_core pmu.
> 0xa in 0xa00000000 indicates the cpu_atom pmu.
>
> Reviewed-by: Andi Kleen <[email protected]>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> tools/perf/util/parse-events.c | 73 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 73 insertions(+)
>
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 81a6fce..1e767dc 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -446,6 +446,24 @@ static int config_attr(struct perf_event_attr *attr,
> struct parse_events_error *err,
> config_term_func_t config_term);
>
> +static void config_hybrid_attr(struct perf_event_attr *attr,
> + int type, int pmu_type)
> +{
> + /*
> + * attr.config layout:
> + * PERF_TYPE_HARDWARE_PMU: 0xDD000000AA
> + * AA: hardware event ID
> + * DD: PMU type ID
> + * PERF_TYPE_HW_CACHE_PMU: 0xDD00CCBBAA
> + * AA: hardware cache ID
> + * BB: hardware cache op ID
> + * CC: hardware cache op result ID
> + * DD: PMU type ID
> + */
> + attr->type = type;
> + attr->config = attr->config | ((__u64)pmu_type << PERF_PMU_TYPE_SHIFT);
> +}
> +
> int parse_events_add_cache(struct list_head *list, int *idx,
> char *type, char *op_result1, char *op_result2,
> struct parse_events_error *err,
> @@ -1409,6 +1427,47 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
> err, head_config);
> }
>
> +static int create_hybrid_hw_event(struct parse_events_state *parse_state,
> + struct list_head *list,
> + struct perf_event_attr *attr,
> + struct perf_pmu *pmu)
> +{
> + struct evsel *evsel;
> + __u32 type = attr->type;
> + __u64 config = attr->config;
> +
> + config_hybrid_attr(attr, PERF_TYPE_HARDWARE_PMU, pmu->type);
> + evsel = __add_event(list, &parse_state->idx, attr, true, NULL,
> + pmu, NULL, false, NULL);
> + if (evsel)
> + evsel->pmu_name = strdup(pmu->name);
> + else
> + return -ENOMEM;
> +
> + attr->type = type;
> + attr->config = config;
> + return 0;
> +}
> +
> +static int add_hybrid_numeric(struct parse_events_state *parse_state,
> + struct list_head *list,
> + struct perf_event_attr *attr,
> + bool *hybrid)
> +{
> + struct perf_pmu *pmu;
> + int ret;
> +
> + *hybrid = false;
> + perf_pmu__for_each_hybrid_pmus(pmu) {
> + *hybrid = true;
> + ret = create_hybrid_hw_event(parse_state, list, attr, pmu);
> + if (ret)
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> int parse_events_add_numeric(struct parse_events_state *parse_state,
> struct list_head *list,
> u32 type, u64 config,
> @@ -1416,6 +1475,8 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
> {
> struct perf_event_attr attr;
> LIST_HEAD(config_terms);
> + bool hybrid;
> + int ret;
>
> memset(&attr, 0, sizeof(attr));
> attr.type = type;
> @@ -1430,6 +1491,18 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
> return -ENOMEM;
> }
>
> + /*
> + * Skip the software dummy event.
> + */
> + if (type != PERF_TYPE_SOFTWARE) {
> + if (!perf_pmu__hybrid_exist())
> + perf_pmu__scan(NULL);
> +
> + ret = add_hybrid_numeric(parse_state, list, &attr, &hybrid);
> + if (hybrid)
> + return ret;
> + }
> +
> return add_event(list, &parse_state->idx, &attr,
> get_config_name(head_config), &config_terms);
> }
> --
> 2.7.4
>

--

- Arnaldo

2021-02-08 20:25:49

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 27/49] perf util: Save pmu name to struct perf_pmu_alias

Em Mon, Feb 08, 2021 at 07:25:24AM -0800, [email protected] escreveu:
> From: Jin Yao <[email protected]>
>
> On a hybrid platform, an event is available on one pmu
> (such as cpu_core or cpu_atom).
>
> This patch saves the pmu name to the pmu field of struct perf_pmu_alias,
> so that we can then know the pmu on which the event can be enabled.
>
> Reviewed-by: Andi Kleen <[email protected]>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> tools/perf/util/pmu.c | 17 +++++++++++++----
> tools/perf/util/pmu.h | 1 +
> 2 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 44ef283..0c25457 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -283,6 +283,7 @@ void perf_pmu_free_alias(struct perf_pmu_alias *newalias)
> zfree(&newalias->str);
> zfree(&newalias->metric_expr);
> zfree(&newalias->metric_name);
> + zfree(&newalias->pmu);
> parse_events_terms__purge(&newalias->terms);
> free(newalias);
> }
> @@ -297,6 +298,10 @@ static bool perf_pmu_merge_alias(struct perf_pmu_alias *newalias,
>
> list_for_each_entry(a, alist, list) {
> if (!strcasecmp(newalias->name, a->name)) {
> + if (newalias->pmu && a->pmu &&
> + !strcasecmp(newalias->pmu, a->pmu)) {
> + continue;
> + }
> perf_pmu_update_alias(a, newalias);
> perf_pmu_free_alias(newalias);
> return true;
> @@ -311,7 +316,8 @@ static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
> char *unit, char *perpkg,
> char *metric_expr,
> char *metric_name,
> - char *deprecated)
> + char *deprecated,
> + char *pmu)
> {
> struct parse_events_term *term;
> struct perf_pmu_alias *alias;
> @@ -382,6 +388,7 @@ static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
> }
> alias->per_pkg = perpkg && sscanf(perpkg, "%d", &num) == 1 && num == 1;
> alias->str = strdup(newval);
> + alias->pmu = pmu ? strdup(pmu) : NULL;
>
> if (deprecated)
> alias->deprecated = true;
> @@ -407,7 +414,7 @@ static int perf_pmu__new_alias(struct list_head *list, char *dir, char *name, FI
> strim(buf);
>
> return __perf_pmu__new_alias(list, dir, name, NULL, buf, NULL, NULL, NULL,
> - NULL, NULL, NULL, NULL);
> + NULL, NULL, NULL, NULL, NULL);
> }
>
> static inline bool pmu_alias_info_file(char *name)
> @@ -797,7 +804,8 @@ void pmu_add_cpu_aliases_map(struct list_head *head, struct perf_pmu *pmu,
> (char *)pe->unit, (char *)pe->perpkg,
> (char *)pe->metric_expr,
> (char *)pe->metric_name,
> - (char *)pe->deprecated);
> + (char *)pe->deprecated,
> + (char *)pe->pmu);
> }
> }
>
> @@ -870,7 +878,8 @@ static int pmu_add_sys_aliases_iter_fn(struct pmu_event *pe, void *data)
> (char *)pe->perpkg,
> (char *)pe->metric_expr,
> (char *)pe->metric_name,
> - (char *)pe->deprecated);
> + (char *)pe->deprecated,
> + NULL);

At some point I think passing the whole 'struct pmu_event' pointer
would be better?
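
As an untested sketch (the argument list is just illustrative):

	static int __perf_pmu__new_alias(struct list_head *list, char *dir,
					 char *name, char *desc, char *val,
					 struct pmu_event *pe)
	{
		...
		alias->pmu = pe && pe->pmu ? strdup(pe->pmu) : NULL;
		...
	}

Then pmu_add_cpu_aliases_map() and pmu_add_sys_aliases_iter_fn() would
simply pass 'pe', and the sysfs alias path would pass NULL.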

> }
>
> return 0;
> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
> index 8164388..0e724d5 100644
> --- a/tools/perf/util/pmu.h
> +++ b/tools/perf/util/pmu.h
> @@ -72,6 +72,7 @@ struct perf_pmu_alias {
> bool deprecated;
> char *metric_expr;
> char *metric_name;
> + char *pmu;
> };
>
> struct perf_pmu *perf_pmu__find(const char *name);
> --
> 2.7.4
>

--

- Arnaldo

2021-02-08 20:30:14

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 39/49] perf parse-events: Support hybrid raw events

Em Mon, Feb 08, 2021 at 07:25:36AM -0800, [email protected] escreveu:
> From: Jin Yao <[email protected]>
>
> On a hybrid platform, the same raw event may be available on both the
> cpu_core pmu and the cpu_atom pmu, so creating two raw events for one
> event encoding is supported.
>
> root@otcpl-adl-s-2:~# ./perf stat -e r3c -a -vv -- sleep 1
> Control descriptor is not initialized
> ------------------------------------------------------------

please move the command output two chars to the right

> perf_event_attr:
> type 4
> size 120
> config 0x3c
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
> sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
> sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
> sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 8
> sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 9
> sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 10
> sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 11
> sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 12
> sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 13
> sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 14
> sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 15
> sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 16
> sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 17
> sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 18
> sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
> ------------------------------------------------------------
> perf_event_attr:
> type 10
> size 120
> config 0x3c
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
> sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 21
> sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 22
> sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 23
> sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 24
> sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 25
> sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 26
> sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
> ...
>
> Performance counter stats for 'system wide':
>
> 13,107,070 r3c
> 316,562 r3c
>
> 1.002161379 seconds time elapsed
>
> It also supports raw events inside a pmu. The syntax is similar:
>
> cpu_core/<raw event>/
> cpu_atom/<raw event>/
>
> root@otcpl-adl-s-2:~# ./perf stat -e cpu_core/r3c/ -vv -- ./triad_loop
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
> type 4
> size 120
> config 0x3c
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 23641 cpu -1 group_fd -1 flags 0x8 = 3
> cpu_core/r3c/: 0: 401407363 102724005 102724005
> cpu_core/r3c/: 401407363 102724005 102724005
>
> Performance counter stats for './triad_loop':
>
> 401,407,363 cpu_core/r3c/
>
> 0.103186241 seconds time elapsed
>
> Reviewed-by: Andi Kleen <[email protected]>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> tools/perf/util/parse-events.c | 56 +++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 55 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index ddf6f79..6d7a2ce 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -1532,6 +1532,55 @@ static int add_hybrid_numeric(struct parse_events_state *parse_state,
> return 0;
> }
>
> +static int create_hybrid_raw_event(struct parse_events_state *parse_state,
> + struct list_head *list,
> + struct perf_event_attr *attr,
> + struct list_head *head_config,
> + struct list_head *config_terms,
> + struct perf_pmu *pmu)
> +{
> + struct evsel *evsel;
> +
> + attr->type = pmu->type;
> + evsel = __add_event(list, &parse_state->idx, attr, true,
> + get_config_name(head_config),
> + pmu, config_terms, false, NULL);
> + if (evsel)
> + evsel->pmu_name = strdup(pmu->name);
> + else
> + return -ENOMEM;
> +
> + return 0;
> +}
> +
> +static int add_hybrid_raw(struct parse_events_state *parse_state,
> + struct list_head *list,
> + struct perf_event_attr *attr,
> + struct list_head *head_config,
> + struct list_head *config_terms,
> + bool *hybrid)
> +{
> + struct perf_pmu *pmu;
> + int ret;
> +
> + *hybrid = false;
> + perf_pmu__for_each_hybrid_pmus(pmu) {
> + *hybrid = true;
> + if (parse_state->pmu_name &&
> + strcmp(parse_state->pmu_name, pmu->name)) {
> + continue;
> + }
> +
> + ret = create_hybrid_raw_event(parse_state, list, attr,
> + head_config, config_terms,
> + pmu);
> + if (ret)
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> int parse_events_add_numeric(struct parse_events_state *parse_state,
> struct list_head *list,
> u32 type, u64 config,
> @@ -1558,7 +1607,12 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
> /*
> * Skip the software dummy event.
> */
> - if (type != PERF_TYPE_SOFTWARE) {
> + if (type == PERF_TYPE_RAW) {
> + ret = add_hybrid_raw(parse_state, list, &attr, head_config,
> + &config_terms, &hybrid);
> + if (hybrid)
> + return ret;
> + } else if (type != PERF_TYPE_SOFTWARE) {
> if (!perf_pmu__hybrid_exist())
> perf_pmu__scan(NULL);
>
> --
> 2.7.4
>

--

- Arnaldo

2021-02-08 20:31:42

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 32/49] perf header: Support HYBRID_TOPOLOGY feature

Em Mon, Feb 08, 2021 at 07:25:29AM -0800, [email protected] escreveu:
> From: Jin Yao <[email protected]>
>
> It would be useful to let the user know the hybrid topology.
> For example, the HYBRID_TOPOLOGY feature in the header indicates which
> cpus are core cpus and which cpus are atom cpus.

Can you please update tools/perf/Documentation/perf.data-file-format.txt?
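
Something along these lines, perhaps (wording untested; the numbering and
exact layout should follow the existing entries in that file), matching
what write_hybrid_topology() emits:

	HEADER_HYBRID_TOPOLOGY = 30,

	Indicates which PMU covers which cpus on a hybrid system:

	struct {
		u32 nr; /* number of hybrid PMUs */
		struct {
			char pmu_name[]; /* e.g. "cpu_core" */
			char cpus[];     /* e.g. "0-15" */
		} [nr]; /* Variable length records */
	};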

> With this patch,

> On a hybrid platform:
>
> root@otcpl-adl-s-2:~# ./perf report --header-only -I
> ...
> # cpu_core cpu list : 0-15
> # cpu_atom cpu list : 16-23
>
> On a non-hybrid platform:
>
> root@kbl-ppc:~# ./perf report --header-only -I
> ...
> # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY
>
> It just shows HYBRID_TOPOLOGY as a missing feature.
>
> Reviewed-by: Andi Kleen <[email protected]>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> tools/perf/util/cputopo.c | 80 +++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/cputopo.h | 13 +++++++
> tools/perf/util/env.c | 6 ++++
> tools/perf/util/env.h | 7 ++++
> tools/perf/util/header.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/header.h | 1 +
> tools/perf/util/pmu.c | 1 -
> tools/perf/util/pmu.h | 1 +
> 8 files changed, 200 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
> index 1b52402..4a00fb8 100644
> --- a/tools/perf/util/cputopo.c
> +++ b/tools/perf/util/cputopo.c
> @@ -12,6 +12,7 @@
> #include "cpumap.h"
> #include "debug.h"
> #include "env.h"
> +#include "pmu.h"
>
> #define CORE_SIB_FMT \
> "%s/devices/system/cpu/cpu%d/topology/core_siblings_list"
> @@ -351,3 +352,82 @@ void numa_topology__delete(struct numa_topology *tp)
>
> free(tp);
> }
> +
> +static int load_hybrid_node(struct hybrid_topology_node *node,
> + struct perf_pmu *pmu)
> +{
> + const char *sysfs;
> + char path[PATH_MAX];
> + char *buf = NULL, *p;
> + FILE *fp;
> + size_t len = 0;
> +
> + node->pmu_name = strdup(pmu->name);
> + if (!node->pmu_name)
> + return -1;
> +
> + sysfs = sysfs__mountpoint();

Check for NULL
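
I.e. something like (untested):

	sysfs = sysfs__mountpoint();
	if (!sysfs)
		goto err;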

> + snprintf(path, PATH_MAX, CPUS_TEMPLATE_CPU, sysfs, pmu->name);
> +
> + fp = fopen(path, "r");
> + if (!fp)
> + goto err;
> +
> + if (getline(&buf, &len, fp) <= 0) {
> + fclose(fp);
> + goto err;
> + }
> +
> + p = strchr(buf, '\n');
> + if (p)
> + *p = '\0';
> +
> + fclose(fp);
> + node->cpus = buf;
> + return 0;
> +
> +err:
> + zfree(&node->pmu_name);
> + free(buf);
> + return -1;
> +}
> +
> +struct hybrid_topology *hybrid_topology__new(void)
> +{
> + struct perf_pmu *pmu;
> + struct hybrid_topology *tp = NULL;
> + u32 nr = 0, i = 0;
> +
> + perf_pmu__for_each_hybrid_pmus(pmu)
> + nr++;
> +
> + if (nr == 0)
> + return NULL;
> +
> + tp = zalloc(sizeof(*tp) + sizeof(tp->nodes[0]) * nr);
> + if (!tp)
> + return NULL;
> +
> + tp->nr = nr;
> + perf_pmu__for_each_hybrid_pmus(pmu) {
> + if (load_hybrid_node(&tp->nodes[i], pmu)) {
> + hybrid_topology__delete(tp);
> + return NULL;
> + }
> + i++;
> + }
> +
> + return tp;
> +}
> +
> +void hybrid_topology__delete(struct hybrid_topology *tp)
> +{
> + u32 i;
> +
> + for (i = 0; i < tp->nr; i++) {
> + zfree(&tp->nodes[i].pmu_name);
> + zfree(&tp->nodes[i].cpus);
> + }
> +
> + free(tp);
> +}
> diff --git a/tools/perf/util/cputopo.h b/tools/perf/util/cputopo.h
> index 6201c37..d9af971 100644
> --- a/tools/perf/util/cputopo.h
> +++ b/tools/perf/util/cputopo.h
> @@ -25,10 +25,23 @@ struct numa_topology {
> struct numa_topology_node nodes[];
> };
>
> +struct hybrid_topology_node {
> + char *pmu_name;
> + char *cpus;
> +};
> +
> +struct hybrid_topology {
> + u32 nr;
> + struct hybrid_topology_node nodes[];
> +};
> +
> struct cpu_topology *cpu_topology__new(void);
> void cpu_topology__delete(struct cpu_topology *tp);
>
> struct numa_topology *numa_topology__new(void);
> void numa_topology__delete(struct numa_topology *tp);
>
> +struct hybrid_topology *hybrid_topology__new(void);
> +void hybrid_topology__delete(struct hybrid_topology *tp);
> +
> #endif /* __PERF_CPUTOPO_H */
> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
> index 9130f6f..9e05eca 100644
> --- a/tools/perf/util/env.c
> +++ b/tools/perf/util/env.c
> @@ -202,6 +202,12 @@ void perf_env__exit(struct perf_env *env)
> for (i = 0; i < env->nr_memory_nodes; i++)
> zfree(&env->memory_nodes[i].set);
> zfree(&env->memory_nodes);
> +
> + for (i = 0; i < env->nr_hybrid_nodes; i++) {
> + perf_cpu_map__put(env->hybrid_nodes[i].map);
> + zfree(&env->hybrid_nodes[i].pmu_name);
> + }
> + zfree(&env->hybrid_nodes);
> }
>
> void perf_env__init(struct perf_env *env __maybe_unused)
> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index ca249bf..9ca7633 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -37,6 +37,11 @@ struct memory_node {
> unsigned long *set;
> };
>
> +struct hybrid_node {
> + char *pmu_name;
> + struct perf_cpu_map *map;
> +};
> +
> struct perf_env {
> char *hostname;
> char *os_release;
> @@ -59,6 +64,7 @@ struct perf_env {
> int nr_pmu_mappings;
> int nr_groups;
> int nr_cpu_pmu_caps;
> + int nr_hybrid_nodes;
> char *cmdline;
> const char **cmdline_argv;
> char *sibling_cores;
> @@ -77,6 +83,7 @@ struct perf_env {
> struct numa_node *numa_nodes;
> struct memory_node *memory_nodes;
> unsigned long long memory_bsize;
> + struct hybrid_node *hybrid_nodes;
> #ifdef HAVE_LIBBPF_SUPPORT
> /*
> * bpf_info_lock protects bpf rbtrees. This is needed because the
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index c4ed3dc..6bcd959 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -932,6 +932,40 @@ static int write_clock_data(struct feat_fd *ff,
> return do_write(ff, data64, sizeof(*data64));
> }
>
> +static int write_hybrid_topology(struct feat_fd *ff,
> + struct evlist *evlist __maybe_unused)
> +{
> + struct hybrid_topology *tp;
> + int ret;
> + u32 i;
> +
> + tp = hybrid_topology__new();
> + if (!tp)
> + return -1;
> +
> + ret = do_write(ff, &tp->nr, sizeof(u32));
> + if (ret < 0)
> + goto err;
> +
> + for (i = 0; i < tp->nr; i++) {
> + struct hybrid_topology_node *n = &tp->nodes[i];
> +
> + ret = do_write_string(ff, n->pmu_name);
> + if (ret < 0)
> + goto err;
> +
> + ret = do_write_string(ff, n->cpus);
> + if (ret < 0)
> + goto err;
> + }
> +
> + ret = 0;
> +
> +err:
> + hybrid_topology__delete(tp);
> + return ret;
> +}
> +
> static int write_dir_format(struct feat_fd *ff,
> struct evlist *evlist __maybe_unused)
> {
> @@ -1623,6 +1657,19 @@ static void print_clock_data(struct feat_fd *ff, FILE *fp)
> clockid_name(clockid));
> }
>
> +static void print_hybrid_topology(struct feat_fd *ff, FILE *fp)
> +{
> + int i;
> + struct hybrid_node *n;
> +
> + for (i = 0; i < ff->ph->env.nr_hybrid_nodes; i++) {
> + n = &ff->ph->env.hybrid_nodes[i];
> +
> + fprintf(fp, "# %s cpu list : ", n->pmu_name);
> + cpu_map__fprintf(n->map, fp);
> + }
> +}
> +
> static void print_dir_format(struct feat_fd *ff, FILE *fp)
> {
> struct perf_session *session;
> @@ -2849,6 +2896,50 @@ static int process_clock_data(struct feat_fd *ff,
> return 0;
> }
>
> +static int process_hybrid_topology(struct feat_fd *ff,
> + void *data __maybe_unused)
> +{
> + struct hybrid_node *nodes, *n;
> + u32 nr, i;
> + char *str;
> +
> + /* nr nodes */
> + if (do_read_u32(ff, &nr))
> + return -1;
> +
> + nodes = zalloc(sizeof(*nodes) * nr);
> + if (!nodes)
> + return -ENOMEM;
> +
> + for (i = 0; i < nr; i++) {
> + n = &nodes[i];
> +
> + n->pmu_name = do_read_string(ff);
> + if (!n->pmu_name)
> + goto error;
> +
> + str = do_read_string(ff);
> + if (!str)
> + goto error;
> +
> + n->map = perf_cpu_map__new(str);
> + free(str);
> + if (!n->map)
> + goto error;
> + }
> +
> + ff->ph->env.nr_hybrid_nodes = nr;
> + ff->ph->env.hybrid_nodes = nodes;
> + return 0;
> +
> +error:
> + for (i = 0; i < nr; i++)
> + free(nodes[i].pmu_name);
> +
> + free(nodes);
> + return -1;
> +}
> +
> static int process_dir_format(struct feat_fd *ff,
> void *_data __maybe_unused)
> {
> @@ -3117,6 +3208,7 @@ const struct perf_header_feature_ops feat_ops[HEADER_LAST_FEATURE] = {
> FEAT_OPR(COMPRESSED, compressed, false),
> FEAT_OPR(CPU_PMU_CAPS, cpu_pmu_caps, false),
> FEAT_OPR(CLOCK_DATA, clock_data, false),
> + FEAT_OPN(HYBRID_TOPOLOGY, hybrid_topology, true),
> };
>
> struct header_print_data {
> diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
> index 2aca717..3f12ec0 100644
> --- a/tools/perf/util/header.h
> +++ b/tools/perf/util/header.h
> @@ -45,6 +45,7 @@ enum {
> HEADER_COMPRESSED,
> HEADER_CPU_PMU_CAPS,
> HEADER_CLOCK_DATA,
> + HEADER_HYBRID_TOPOLOGY,
> HEADER_LAST_FEATURE,
> HEADER_FEAT_BITS = 256,
> };
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index 9a6c973..ca2fc67 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -607,7 +607,6 @@ static struct perf_cpu_map *__pmu_cpumask(const char *path)
> */
> #define SYS_TEMPLATE_ID "./bus/event_source/devices/%s/identifier"
> #define CPUS_TEMPLATE_UNCORE "%s/bus/event_source/devices/%s/cpumask"
> -#define CPUS_TEMPLATE_CPU "%s/bus/event_source/devices/%s/cpus"
>
> static struct perf_cpu_map *pmu_cpumask(const char *name)
> {
> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
> index 5b727cf..ccffc05 100644
> --- a/tools/perf/util/pmu.h
> +++ b/tools/perf/util/pmu.h
> @@ -20,6 +20,7 @@ enum {
>
> #define PERF_PMU_FORMAT_BITS 64
> #define EVENT_SOURCE_DEVICE_PATH "/bus/event_source/devices/"
> +#define CPUS_TEMPLATE_CPU "%s/bus/event_source/devices/%s/cpus"
>
> struct perf_event_attr;
>
> --
> 2.7.4
>

--

- Arnaldo

2021-02-08 20:31:50

by Luck, Tony

[permalink] [raw]
Subject: Re: [PATCH 02/49] x86/cpu: Describe hybrid CPUs in cpuinfo_x86

On Mon, Feb 08, 2021 at 02:04:24PM -0500, Liang, Kan wrote:
> On 2/8/2021 12:56 PM, Borislav Petkov wrote:
>
> I think it's good enough for perf, but I'm not sure whether other code needs
> the CPU type information.
>
> Ricardo, do you know?
>
> Maybe we should implement a generic function as below for this?
> (Not tested. Just used as an example.)
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index a66c1fd..679f5fe 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -2056,3 +2056,11 @@ void arch_smt_update(void)
> /* Check whether IPI broadcasting can be enabled */
> apic_smt_update();
> }
> +
> +u32 x86_read_hybrid_type(void)
> +{
> + if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU))
> + return cpuid_eax(0x0000001a);
> +
> + return 0;
> +}

Machine check logging will want to include this in "struct mce".

But ok to pick it up with a function like you describe above.

-Tony

2021-02-08 20:32:34

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 49/49] perf evsel: Adjust hybrid event and global event mixed group

Em Mon, Feb 08, 2021 at 07:25:46AM -0800, [email protected] escreveu:
> From: Jin Yao <[email protected]>
>
> A group mixing a hybrid event and a global event is allowed. For example,
> the group leader is 'cpu-clock' and the group member is 'cpu_atom/cycles/'.
>
> e.g.
> perf stat -e '{cpu-clock,cpu_atom/cycles/}' -a
>
> The challenge is that their available cpus are not fully matched.
> For example, 'cpu-clock' is available on CPU0-CPU23, but 'cpu_atom/cycles/'
> is available only on CPU16-CPU23.
>
> When getting the group id for a group member, we must be very careful
> because the cpu for 'cpu-clock' is not equal to the cpu for 'cpu_atom/cycles/'.
> Actually the cpu here is the index of evsel->core.cpus, not the real CPU ID.
> e.g. cpu0 for 'cpu-clock' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.
>
> Another challenge is group read. The events in a group may not be
> available on all cpus. For example, the leader is a software event and
> it's available on CPU0-CPU1, but the group member is a hybrid event and
> it's only available on CPU1. For CPU0, we have only one event, but for CPU1
> we have two events. So we need to change the read size according to
> the real number of events on that cpu.
>
> Let's see examples,
>
> root@otcpl-adl-s-2:~# ./perf stat -e '{cpu-clock,cpu_atom/cycles/}' -a -vvv -- sleep 1
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
> type 1
> size 120
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
> disabled 1
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
> sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
> sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
> sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 8
> sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 9
> sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 10
> sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 11
> sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 12
> sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 13
> sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 14
> sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 15
> sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 16
> sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 17
> sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 18
> sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
> sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 21
> sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 22
> sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 23
> sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 24
> sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 25
> sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 26
> sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0xa00000000
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 16 group_fd 20 flags 0x8 = 28
> sys_perf_event_open: pid -1 cpu 17 group_fd 21 flags 0x8 = 29
> sys_perf_event_open: pid -1 cpu 18 group_fd 22 flags 0x8 = 30
> sys_perf_event_open: pid -1 cpu 19 group_fd 23 flags 0x8 = 31
> sys_perf_event_open: pid -1 cpu 20 group_fd 24 flags 0x8 = 32
> sys_perf_event_open: pid -1 cpu 21 group_fd 25 flags 0x8 = 33
> sys_perf_event_open: pid -1 cpu 22 group_fd 26 flags 0x8 = 34
> sys_perf_event_open: pid -1 cpu 23 group_fd 27 flags 0x8 = 35
> cpu-clock: 0: 1001661765 1001663044 1001663044
> cpu-clock: 1: 1001659407 1001659885 1001659885
> cpu-clock: 2: 1001646087 1001647302 1001647302
> cpu-clock: 3: 1001645168 1001645550 1001645550
> cpu-clock: 4: 1001645052 1001646102 1001646102
> cpu-clock: 5: 1001643719 1001644472 1001644472
> cpu-clock: 6: 1001641893 1001642859 1001642859
> cpu-clock: 7: 1001640524 1001641036 1001641036
> cpu-clock: 8: 1001637596 1001638076 1001638076
> cpu-clock: 9: 1001638121 1001638200 1001638200
> cpu-clock: 10: 1001635825 1001636915 1001636915
> cpu-clock: 11: 1001633722 1001634276 1001634276
> cpu-clock: 12: 1001687133 1001686941 1001686941
> cpu-clock: 13: 1001693663 1001693317 1001693317
> cpu-clock: 14: 1001693381 1001694407 1001694407
> cpu-clock: 15: 1001691865 1001692321 1001692321
> cpu-clock: 16: 1001696621 1001696550 1001696550
> cpu-clock: 17: 1001699963 1001699822 1001699822
> cpu-clock: 18: 1001701938 1001701850 1001701850
> cpu-clock: 19: 1001699298 1001699214 1001699214
> cpu-clock: 20: 1001691550 1001691026 1001691026
> cpu-clock: 21: 1001688348 1001688212 1001688212
> cpu-clock: 22: 1001684907 1001684799 1001684799
> cpu-clock: 23: 1001680840 1001680780 1001680780
> cycles: 0: 28175 1001696550 1001696550
> cycles: 1: 403323 1001699822 1001699822
> cycles: 2: 35905 1001701850 1001701850
> cycles: 3: 36755 1001699214 1001699214
> cycles: 4: 33757 1001691026 1001691026
> cycles: 5: 37146 1001688212 1001688212
> cycles: 6: 35483 1001684799 1001684799
> cycles: 7: 38600 1001680780 1001680780
> cpu-clock: 24040038386 24040046956 24040046956
> cycles: 649144 8013542253 8013542253
>
> Performance counter stats for 'system wide':
>
> 24,040.04 msec cpu-clock # 23.976 CPUs utilized
> 649,144 cycles [cpu_atom] # 0.027 M/sec
>
> 1.002683706 seconds time elapsed
>
> For cpu_atom/cycles/, cpu16-cpu23 are set with a valid group fd (cpu-clock's fd
> on that cpu). For counting results, cpu-clock aggregates over 24 cpus and
> cpu_atom/cycles/ aggregates over 8 cpus. That's expected.
>
> But if the event order is changed, e.g. '{cpu_atom/cycles/,cpu-clock}',
> there is more work to do.
>
> root@otcpl-adl-s-2:~# ./perf stat -e '{cpu_atom/cycles/,cpu-clock}' -a -vvv -- sleep 1
> Control descriptor is not initialized
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0xa00000000
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
> disabled 1
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 3
> sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 4
> sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 5
> sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 7
> sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 8
> sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 9
> sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 10
> sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 11
> ------------------------------------------------------------
> perf_event_attr:
> type 1
> size 120
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
> inherit 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 12
> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 13
> sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 14
> sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 15
> sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 16
> sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 17
> sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 18
> sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 19
> sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 20
> sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 21
> sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 22
> sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 23
> sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 24
> sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 25
> sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 26
> sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 27
> sys_perf_event_open: pid -1 cpu 16 group_fd 3 flags 0x8 = 28
> sys_perf_event_open: pid -1 cpu 17 group_fd 4 flags 0x8 = 29
> sys_perf_event_open: pid -1 cpu 18 group_fd 5 flags 0x8 = 30
> sys_perf_event_open: pid -1 cpu 19 group_fd 7 flags 0x8 = 31
> sys_perf_event_open: pid -1 cpu 20 group_fd 8 flags 0x8 = 32
> sys_perf_event_open: pid -1 cpu 21 group_fd 9 flags 0x8 = 33
> sys_perf_event_open: pid -1 cpu 22 group_fd 10 flags 0x8 = 34
> sys_perf_event_open: pid -1 cpu 23 group_fd 11 flags 0x8 = 35
> cycles: 0: 422260 1001993637 1001993637
> cycles: 1: 631309 1002039934 1002039934
> cycles: 2: 309501 1002018065 1002018065
> cycles: 3: 119279 1002040811 1002040811
> cycles: 4: 89389 1002039312 1002039312
> cycles: 5: 155437 1002054794 1002054794
> cycles: 6: 92420 1002051141 1002051141
> cycles: 7: 96017 1002073659 1002073659
> cpu-clock: 0: 0 0 0
> cpu-clock: 1: 0 0 0
> cpu-clock: 2: 0 0 0
> cpu-clock: 3: 0 0 0
> cpu-clock: 4: 0 0 0
> cpu-clock: 5: 0 0 0
> cpu-clock: 6: 0 0 0
> cpu-clock: 7: 0 0 0
> cpu-clock: 8: 0 0 0
> cpu-clock: 9: 0 0 0
> cpu-clock: 10: 0 0 0
> cpu-clock: 11: 0 0 0
> cpu-clock: 12: 0 0 0
> cpu-clock: 13: 0 0 0
> cpu-clock: 14: 0 0 0
> cpu-clock: 15: 0 0 0
> cpu-clock: 16: 1001997706 1001993637 1001993637
> cpu-clock: 17: 1002040524 1002039934 1002039934
> cpu-clock: 18: 1002018570 1002018065 1002018065
> cpu-clock: 19: 1002041360 1002040811 1002040811
> cpu-clock: 20: 1002044731 1002039312 1002039312
> cpu-clock: 21: 1002055355 1002054794 1002054794
> cpu-clock: 22: 1002051659 1002051141 1002051141
> cpu-clock: 23: 1002074150 1002073659 1002073659
> cycles: 1915612 8016311353 8016311353
> cpu-clock: 8016324055 8016311353 8016311353
>
> Performance counter stats for 'system wide':
>
> 1,915,612 cycles [cpu_atom] # 0.239 M/sec

I suggested having something like this in a previous patch, when
creating two 'instructions', etc. events, one for cpu_core and the other
for cpu_atom, perhaps even using the PMU style, i.e.

1,915,612 cpu_atom/cycles/ # 0.239 M/sec

> 8,016.32 msec cpu-clock # 7.996 CPUs utilized
>
> 1.002545027 seconds time elapsed
>
> For cpu-clock, cpu16-cpu23 are set with a valid group fd (cpu_atom/cycles/'s
> fd on that cpu). For counting results, cpu_atom/cycles/ aggregates over 8
> cpus, which is correct. But cpu-clock also aggregates over only 8 cpus
> (cpu16-cpu23, not all cpus); the code should be improved. For now one warning
> is displayed: "WARNING: for cpu-clock, some CPU counts not read".
>
> Reviewed-by: Andi Kleen <[email protected]>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> tools/perf/util/evsel.c | 105 +++++++++++++++++++++++++++++++++++++++++++++---
> tools/perf/util/stat.h | 1 +
> 2 files changed, 101 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 61508cf..65c8cfc8 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1453,15 +1453,26 @@ static void evsel__set_count(struct evsel *counter, int cpu, int thread, u64 val
> perf_counts__set_loaded(counter->counts, cpu, thread, true);
> }
>
> -static int evsel__process_group_data(struct evsel *leader, int cpu, int thread, u64 *data)
> +static int evsel_cpuid_match(struct evsel *evsel1, struct evsel *evsel2,
> + int cpu)
> +{
> + int cpuid;
> +
> + cpuid = perf_cpu_map__cpu(evsel1->core.cpus, cpu);
> + return perf_cpu_map__idx(evsel2->core.cpus, cpuid);
> +}
> +
> +static int evsel__process_group_data(struct evsel *leader, int cpu, int thread,
> + u64 *data, int nr_members)
> {
> u64 read_format = leader->core.attr.read_format;
> struct sample_read_value *v;
> u64 nr, ena = 0, run = 0, i;
> + int idx;
>
> nr = *data++;
>
> - if (nr != (u64) leader->core.nr_members)
> + if (nr != (u64) nr_members)
> return -EINVAL;
>
> if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
> @@ -1481,24 +1492,85 @@ static int evsel__process_group_data(struct evsel *leader, int cpu, int thread,
> if (!counter)
> return -EINVAL;
>
> - evsel__set_count(counter, cpu, thread, v[i].value, ena, run);
> + if (evsel__is_hybrid_event(counter) ||
> + evsel__is_hybrid_event(leader)) {
> + idx = evsel_cpuid_match(leader, counter, cpu);
> + if (idx == -1)
> + return -EINVAL;
> + } else
> + idx = cpu;
> +
> + evsel__set_count(counter, idx, thread, v[i].value, ena, run);
> }
>
> return 0;
> }
>
> +static int hybrid_read_size(struct evsel *leader, int cpu, int *nr_members)
> +{
> + struct evsel *pos;
> + int nr = 1, back, new_size = 0, idx;
> +
> + for_each_group_member(pos, leader) {
> + idx = evsel_cpuid_match(leader, pos, cpu);
> + if (idx != -1)
> + nr++;
> + }
> +
> + if (nr != leader->core.nr_members) {
> + back = leader->core.nr_members;
> + leader->core.nr_members = nr;
> + new_size = perf_evsel__read_size(&leader->core);
> + leader->core.nr_members = back;
> + }
> +
> + *nr_members = nr;
> + return new_size;
> +}
> +
> static int evsel__read_group(struct evsel *leader, int cpu, int thread)
> {
> struct perf_stat_evsel *ps = leader->stats;
> u64 read_format = leader->core.attr.read_format;
> int size = perf_evsel__read_size(&leader->core);
> + int new_size, nr_members;
> u64 *data = ps->group_data;
>
> if (!(read_format & PERF_FORMAT_ID))
> return -EINVAL;
>
> - if (!evsel__is_group_leader(leader))
> + if (!evsel__is_group_leader(leader)) {
> + if (evsel__is_hybrid_event(leader->leader) &&
> + !evsel__is_hybrid_event(leader)) {
> + /*
> + * The group leader is hybrid event and it's
> + * only available on part of cpus. But the group
> + * member are available on all cpus. TODO:
> + * read the counts on the rest of cpus for group
> + * member.
> + */
> + WARN_ONCE(1, "WARNING: for %s, some CPU counts "
> + "not read\n", leader->name);
> + return 0;
> + }
> return -EINVAL;
> + }
> +
> + /*
> + * For example the leader is a software event and it's available on
> + * cpu0-cpu1, but the group member is a hybrid event and it's only
> + * available on cpu1. For cpu0, we have only one event, but for cpu1
> + * we have two events. So we need to change the read size according to
> + * the real number of events on a given cpu.
> + */
> + new_size = hybrid_read_size(leader, cpu, &nr_members);
> + if (new_size)
> + size = new_size;
> +
> + if (ps->group_data && ps->group_data_size < size) {
> + zfree(&ps->group_data);
> + data = NULL;
> + }
>
> if (!data) {
> data = zalloc(size);
> @@ -1506,6 +1578,7 @@ static int evsel__read_group(struct evsel *leader, int cpu, int thread)
> return -ENOMEM;
>
> ps->group_data = data;
> + ps->group_data_size = size;
> }
>
> if (FD(leader, cpu, thread) < 0)
> @@ -1514,7 +1587,7 @@ static int evsel__read_group(struct evsel *leader, int cpu, int thread)
> if (readn(FD(leader, cpu, thread), data, size) <= 0)
> return -errno;
>
> - return evsel__process_group_data(leader, cpu, thread, data);
> + return evsel__process_group_data(leader, cpu, thread, data, nr_members);
> }
>
> int evsel__read_counter(struct evsel *evsel, int cpu, int thread)
> @@ -1561,6 +1634,28 @@ static int get_group_fd(struct evsel *evsel, int cpu, int thread)
> */
> BUG_ON(!leader->core.fd);
>
> + /*
> + * If leader is not hybrid event, it's available on
> + * all cpus (e.g. software event). But hybrid evsel
> + * member is only available on part of cpus. So need
> + * to get the leader's fd from correct cpu.
> + */
> + if (evsel__is_hybrid_event(evsel) &&
> + !evsel__is_hybrid_event(leader)) {
> + cpu = evsel_cpuid_match(evsel, leader, cpu);
> + BUG_ON(cpu == -1);
> + }
> +
> + /*
> + * Leader is hybrid event but member is global event.
> + */
> + if (!evsel__is_hybrid_event(evsel) &&
> + evsel__is_hybrid_event(leader)) {
> + cpu = evsel_cpuid_match(evsel, leader, cpu);
> + if (cpu == -1)
> + return -1;
> + }
> +
> fd = FD(leader, cpu, thread);
> BUG_ON(fd == -1);
>
> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
> index 80f6715..b96168c 100644
> --- a/tools/perf/util/stat.h
> +++ b/tools/perf/util/stat.h
> @@ -46,6 +46,7 @@ struct perf_stat_evsel {
> struct stats res_stats[3];
> enum perf_stat_evsel_id id;
> u64 *group_data;
> + int group_data_size;
> };
>
> enum aggr_mode {
> --
> 2.7.4
>

--

- Arnaldo

2021-02-08 20:33:17

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH 02/49] x86/cpu: Describe hybrid CPUs in cpuinfo_x86



On 2/8/2021 12:56 PM, Borislav Petkov wrote:
> On Mon, Feb 08, 2021 at 07:24:59AM -0800, [email protected] wrote:
>> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
>> index c20a52b..1f25ac9 100644
>> --- a/arch/x86/include/asm/processor.h
>> +++ b/arch/x86/include/asm/processor.h
>> @@ -139,6 +139,16 @@ struct cpuinfo_x86 {
>> u32 microcode;
>> /* Address space bits used by the cache internally */
>> u8 x86_cache_bits;
>> + /*
>> + * In hybrid processors, there is a CPU type and a native model ID. The
>> + * CPU type (x86_cpu_type[31:24]) describes the type of micro-
>> + * architecture families. The native model ID (x86_cpu_type[23:0])
>> + * describes a specific microarchitecture version. Combining both
>> + * allows a CPU to be uniquely identified.
>> + *
>> + * Please note that the native model ID is not related to x86_model.
>> + */
>> + u32 x86_cpu_type;
>
> Why are you adding it here instead of simply using
> X86_FEATURE_HYBRID_CPU at the call site?
>
> How many uses in this patchset?
>
> /me searches...
>
> Exactly one.
>
> Just query X86_FEATURE_HYBRID_CPU at the call site and read what you
> need from CPUID and use it there - no need for this.
>

I think it's good enough for perf, but I'm not sure whether other code
needs the CPU type information.

Ricardo, do you know?

Maybe we should implement a generic function as below for this?
(Not tested. Just used as an example.)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a66c1fd..679f5fe 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2056,3 +2056,11 @@ void arch_smt_update(void)
/* Check whether IPI broadcasting can be enabled */
apic_smt_update();
}
+
+u32 x86_read_hybrid_type(void)
+{
+ if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU))
+ return cpuid_eax(0x0000001a);
+
+ return 0;
+}


Thanks,
Kan

2021-02-08 20:33:50

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 43/49] perf stat: Add default hybrid events

Em Mon, Feb 08, 2021 at 07:25:40AM -0800, [email protected] escreveu:
> From: Jin Yao <[email protected]>
>
> Previously, if '-e' is not specified in perf stat, some software events
> and hardware events are added to the evlist by default.
>
> root@otcpl-adl-s-2:~# ./perf stat -- ./triad_loop
>
> Performance counter stats for './triad_loop':
>
> 109.43 msec task-clock # 0.993 CPUs utilized
> 1 context-switches # 0.009 K/sec
> 0 cpu-migrations # 0.000 K/sec
> 105 page-faults # 0.960 K/sec
> 401,161,982 cycles # 3.666 GHz
> 1,601,216,357 instructions # 3.99 insn per cycle
> 200,217,751 branches # 1829.686 M/sec
> 14,555 branch-misses # 0.01% of all branches
>
> 0.110176860 seconds time elapsed
>
> Among the events, cycles, instructions, branches and branch-misses
> are hardware events.
>
> On a hybrid platform, two events are created for one hardware event:
>
> core cycles,
> atom cycles,
> core instructions,
> atom instructions,
> core branches,
> atom branches,
> core branch-misses,
> atom branch-misses
>
> These events will be added to the evlist in order on a hybrid platform
> if '-e' is not set.
>
> Since parse_events() now supports creating two hardware events for
> one event on a hybrid platform, we just use parse_events(evlist,
> "cycles,instructions,branches,branch-misses") to create the default
> events and add them to the evlist.
>
> After:
> root@otcpl-adl-s-2:~# ./perf stat -vv -- taskset -c 16 ./triad_loop
> ...
> ------------------------------------------------------------
> perf_event_attr:
> type 1
> size 120
> config 0x1
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 3
> ------------------------------------------------------------
> perf_event_attr:
> type 1
> size 120
> config 0x3
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 4
> ------------------------------------------------------------
> perf_event_attr:
> type 1
> size 120
> config 0x4
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 5
> ------------------------------------------------------------
> perf_event_attr:
> type 1
> size 120
> config 0x2
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 7
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0x400000000
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 8
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0xa00000000
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 9
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0x400000001
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 10
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0xa00000001
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 11
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0x400000004
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 12
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0xa00000004
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 13
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0x400000005
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 14
> ------------------------------------------------------------
> perf_event_attr:
> type 6
> size 120
> config 0xa00000005
> sample_type IDENTIFIER
> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> disabled 1
> inherit 1
> enable_on_exec 1
> exclude_guest 1
> ------------------------------------------------------------
> ...
>
> Performance counter stats for 'taskset -c 16 ./triad_loop':
>
> 201.31 msec task-clock # 0.997 CPUs utilized
> 1 context-switches # 0.005 K/sec
> 1 cpu-migrations # 0.005 K/sec
> 166 page-faults # 0.825 K/sec
> 623,267,134 cycles # 3096.043 M/sec (0.16%)
> 603,082,383 cycles # 2995.777 M/sec (99.84%)
> 406,410,481 instructions # 2018.820 M/sec (0.16%)
> 1,604,213,375 instructions # 7968.837 M/sec (99.84%)
> 81,444,171 branches # 404.569 M/sec (0.16%)
> 200,616,430 branches # 996.550 M/sec (99.84%)
> 3,769,856 branch-misses # 18.727 M/sec (0.16%)
> 16,111 branch-misses # 0.080 M/sec (99.84%)
>
> 0.201895853 seconds time elapsed
>
> We can see that two events are created for one hardware event.
> The first one is the core event, the second one is the atom event.

Can we have that (core/atom) as a prefix or in the comment area?

> One thing is that the shadow stats look a bit different; now it's just
> 'M/sec'.
>
> The perf_stat__update_shadow_stats and perf_stat__print_shadow_stats
> functions need to be improved in the future if we want to get the
> original shadow stats.
>
> Reviewed-by: Andi Kleen <[email protected]>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> tools/perf/builtin-stat.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index 44d1a5f..0b08665 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1137,6 +1137,13 @@ static int parse_hybrid_type(const struct option *opt,
> return 0;
> }
>
> +static int add_default_hybrid_events(struct evlist *evlist)
> +{
> + struct parse_events_error err;
> +
> + return parse_events(evlist, "cycles,instructions,branches,branch-misses", &err);
> +}
> +
> static struct option stat_options[] = {
> OPT_BOOLEAN('T', "transaction", &transaction_run,
> "hardware transaction statistics"),
> @@ -1613,6 +1620,12 @@ static int add_default_attributes(void)
> { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },
>
> };
> + struct perf_event_attr default_sw_attrs[] = {
> + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
> + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
> + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
> + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
> +};
>
> /*
> * Detailed stats (-d), covering the L1 and last level data caches:
> @@ -1849,6 +1862,15 @@ static int add_default_attributes(void)
> }
>
> if (!evsel_list->core.nr_entries) {
> + perf_pmu__scan(NULL);
> + if (perf_pmu__hybrid_exist()) {
> + if (evlist__add_default_attrs(evsel_list,
> + default_sw_attrs) < 0) {
> + return -1;
> + }
> + return add_default_hybrid_events(evsel_list);
> + }
> +
> if (target__has_cpu(&target))
> default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;
>
> --
> 2.7.4
>

--

- Arnaldo

2021-02-08 20:37:02

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 46/49] perf stat: Filter out unmatched aggregation for hybrid event

On Mon, Feb 08, 2021 at 07:25:43AM -0800, [email protected] wrote:
> From: Jin Yao <[email protected]>
>
> perf-stat supports several aggregation modes, such as --per-core and
> --per-socket. A hybrid event, however, may only be available on a
> subset of the CPUs. So for --per-core, we need to filter out the
> unavailable cores; for --per-socket, filter out the unavailable
> sockets; and so on.
>
> Before:
>
> root@otcpl-adl-s-2:~# ./perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1
>
> Performance counter stats for 'system wide':
>
> S0-D0-C0 2 311,114 cycles [cpu_core]

Why not use the pmu style event name, i.e.:

S0-D0-C0 2 311,114 cpu_core/cycles/

?

> S0-D0-C4 2 59,784 cycles [cpu_core]
> S0-D0-C8 2 121,287 cycles [cpu_core]
> S0-D0-C12 2 2,690,245 cycles [cpu_core]
> S0-D0-C16 2 2,060,545 cycles [cpu_core]
> S0-D0-C20 2 3,632,251 cycles [cpu_core]
> S0-D0-C24 2 775,736 cycles [cpu_core]
> S0-D0-C28 2 742,020 cycles [cpu_core]
> S0-D0-C32 0 <not counted> cycles [cpu_core]
> S0-D0-C33 0 <not counted> cycles [cpu_core]
> S0-D0-C34 0 <not counted> cycles [cpu_core]
> S0-D0-C35 0 <not counted> cycles [cpu_core]
> S0-D0-C36 0 <not counted> cycles [cpu_core]
> S0-D0-C37 0 <not counted> cycles [cpu_core]
> S0-D0-C38 0 <not counted> cycles [cpu_core]
> S0-D0-C39 0 <not counted> cycles [cpu_core]
>
> 1.001779842 seconds time elapsed
>
> After:
>
> root@otcpl-adl-s-2:~# ./perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1
>
> Performance counter stats for 'system wide':
>
> S0-D0-C0 2 1,088,230 cycles [cpu_core]
> S0-D0-C4 2 57,228 cycles [cpu_core]
> S0-D0-C8 2 98,327 cycles [cpu_core]
> S0-D0-C12 2 2,741,955 cycles [cpu_core]
> S0-D0-C16 2 2,090,432 cycles [cpu_core]
> S0-D0-C20 2 3,192,108 cycles [cpu_core]
> S0-D0-C24 2 2,910,752 cycles [cpu_core]
> S0-D0-C28 2 388,696 cycles [cpu_core]
>
> Reviewed-by: Andi Kleen <[email protected]>
> Signed-off-by: Jin Yao <[email protected]>
> ---
> tools/perf/util/stat-display.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
> index 21a3f80..fa11572 100644
> --- a/tools/perf/util/stat-display.c
> +++ b/tools/perf/util/stat-display.c
> @@ -630,6 +630,20 @@ static void aggr_cb(struct perf_stat_config *config,
> }
> }
>
> +static bool aggr_id_hybrid_matched(struct perf_stat_config *config,
> + struct evsel *counter, struct aggr_cpu_id id)
> +{
> + struct aggr_cpu_id s;
> +
> + for (int i = 0; i < evsel__nr_cpus(counter); i++) {
> + s = config->aggr_get_id(config, evsel__cpus(counter), i);
> + if (cpu_map__compare_aggr_cpu_id(s, id))
> + return true;
> + }
> +
> + return false;
> +}
> +
> static void print_counter_aggrdata(struct perf_stat_config *config,
> struct evsel *counter, int s,
> char *prefix, bool metric_only,
> @@ -643,6 +657,12 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
> double uval;
>
> ad.id = id = config->aggr_map->map[s];
> +
> + if (perf_pmu__hybrid_exist() &&
> + !aggr_id_hybrid_matched(config, counter, id)) {
> + return;
> + }
> +
> ad.val = ad.ena = ad.run = 0;
> ad.nr = 0;
> if (!collect_data(config, counter, aggr_cb, &ad))
> --
> 2.7.4
>

--

- Arnaldo

2021-02-08 20:39:28

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 02/49] x86/cpu: Describe hybrid CPUs in cpuinfo_x86

On Mon, Feb 08, 2021 at 11:10:18AM -0800, Luck, Tony wrote:
> > +u32 x86_read_hybrid_type(void)
> > +{
> > + if (cpu_feature_enabled(X86_FEATURE_HYBRID_CPU))
> > + return cpuid_eax(0x0000001a);
> > +
> > + return 0;
> > +}
>
> Machine check logging will want to include this in "struct mce".
>
> But ok to pick it up with a function like you describe above.

Sure, that looks ok.

We can always lift it up into cpuinfo_x86 later, when it is needed on
the majority of machines, but right now only a small subset of machines
have this.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2021-02-09 00:08:53

by Jin Yao

[permalink] [raw]
Subject: Re: [PATCH 28/49] perf pmu: Save detected hybrid pmus to a global pmu list

Hi Arnaldo,

On 2/9/2021 2:55 AM, Arnaldo Carvalho de Melo wrote:
> On Mon, Feb 08, 2021 at 07:25:25AM -0800, [email protected] wrote:
>> From: Jin Yao <[email protected]>
>>
>> We identify the cpu_core pmu and cpu_atom pmu by explicitly
>> checking the following files:
>>
>> For cpu_core, check:
>> "/sys/bus/event_source/devices/cpu_core/cpus"
>>
>> For cpu_atom, check:
>> "/sys/bus/event_source/devices/cpu_atom/cpus"
>>
>> If the 'cpus' file exists, the pmu exists.
>>
>> But in order not to hardcode "cpu_core" and "cpu_atom", and to keep
>> the code generic, the hybrid pmu is considered to exist if the path
>> "/sys/bus/event_source/devices/cpu_xxx/cpus" exists. All the detected
>> hybrid pmus are linked to a global list 'perf_pmu__hybrid_pmus', and
>> then we just need to iterate the list using
>> perf_pmu__for_each_hybrid_pmus.
>>
>> Reviewed-by: Andi Kleen <[email protected]>
>> Signed-off-by: Jin Yao <[email protected]>
>> ---
>> tools/perf/util/pmu.c | 21 +++++++++++++++++++++
>> tools/perf/util/pmu.h | 7 +++++++
>> 2 files changed, 28 insertions(+)
>>
>> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
>> index 0c25457..e97b121 100644
>> --- a/tools/perf/util/pmu.c
>> +++ b/tools/perf/util/pmu.c
>> @@ -27,6 +27,7 @@
>> #include "fncache.h"
>>
>> struct perf_pmu perf_pmu__fake;
>> +LIST_HEAD(perf_pmu__hybrid_pmus);
>>
>> struct perf_pmu_format {
>> char *name;
>> @@ -633,11 +634,27 @@ static struct perf_cpu_map *pmu_cpumask(const char *name)
>> return NULL;
>> }
>>
>> +static bool pmu_is_hybrid(const char *name)
>> +{
>> + char path[PATH_MAX];
>> + const char *sysfs;
>> +
>> + if (strncmp(name, "cpu_", 4))
>> + return false;
>> +
>> + sysfs = sysfs__mountpoint();
>
> It's extremely unlikely that sysfs isn't mounted, but if so, this will
> NULL deref, so please do as other sysfs__mountpoint() uses in
> tools/perf/util/pmu.c and check if sysfs is NULL, returning false, i.e.
> file isn't available.
>

Yes, I need to check the return value of sysfs__mountpoint(), something like:

sysfs = sysfs__mountpoint();
if (!sysfs)
	return false;
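
For reference, a minimal sketch of pmu_is_hybrid() with that check folded
in (illustrative only; it just combines the hunk quoted above with the
extra NULL test):

static bool pmu_is_hybrid(const char *name)
{
	char path[PATH_MAX];
	const char *sysfs;

	/* Hybrid core PMUs are named cpu_core / cpu_atom. */
	if (strncmp(name, "cpu_", 4))
		return false;

	/* Bail out gracefully if sysfs is not mounted. */
	sysfs = sysfs__mountpoint();
	if (!sysfs)
		return false;

	snprintf(path, PATH_MAX, CPUS_TEMPLATE_CPU, sysfs, name);
	return file_available(path);
}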

>> + snprintf(path, PATH_MAX, CPUS_TEMPLATE_CPU, sysfs, name);
>> + return file_available(path);
>> +}
>> +
>> static bool pmu_is_uncore(const char *name)
>> {
>> char path[PATH_MAX];
>> const char *sysfs;
>>
>> + if (pmu_is_hybrid(name))
>> + return false;
>> +
>> sysfs = sysfs__mountpoint();
>> snprintf(path, PATH_MAX, CPUS_TEMPLATE_UNCORE, sysfs, name);
>> return file_available(path);
>> @@ -951,6 +968,7 @@ static struct perf_pmu *pmu_lookup(const char *name)
>> pmu->is_uncore = pmu_is_uncore(name);
>> if (pmu->is_uncore)
>> pmu->id = pmu_id(name);
>> + pmu->is_hybrid = pmu_is_hybrid(name);
>> pmu->max_precise = pmu_max_precise(name);
>> pmu_add_cpu_aliases(&aliases, pmu);
>> pmu_add_sys_aliases(&aliases, pmu);
>> @@ -962,6 +980,9 @@ static struct perf_pmu *pmu_lookup(const char *name)
>> list_splice(&aliases, &pmu->aliases);
>> list_add_tail(&pmu->list, &pmus);
>>
>> + if (pmu->is_hybrid)
>> + list_add_tail(&pmu->hybrid_list, &perf_pmu__hybrid_pmus);
>> +
>> pmu->default_config = perf_pmu__get_default_config(pmu);
>>
>> return pmu;
>> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
>> index 0e724d5..99bdb5d 100644
>> --- a/tools/perf/util/pmu.h
>> +++ b/tools/perf/util/pmu.h
>> @@ -5,6 +5,7 @@
>> #include <linux/bitmap.h>
>> #include <linux/compiler.h>
>> #include <linux/perf_event.h>
>> +#include <linux/list.h>
>> #include <stdbool.h>
>> #include "parse-events.h"
>> #include "pmu-events/pmu-events.h"
>> @@ -34,6 +35,7 @@ struct perf_pmu {
>> __u32 type;
>> bool selectable;
>> bool is_uncore;
>> + bool is_hybrid;
>> bool auxtrace;
>> int max_precise;
>> struct perf_event_attr *default_config;
>> @@ -42,9 +44,11 @@ struct perf_pmu {
>> struct list_head aliases; /* HEAD struct perf_pmu_alias -> list */
>> struct list_head caps; /* HEAD struct perf_pmu_caps -> list */
>> struct list_head list; /* ELEM */
>> + struct list_head hybrid_list;
>> };
>>
>> extern struct perf_pmu perf_pmu__fake;
>> +extern struct list_head perf_pmu__hybrid_pmus;
>>
>> struct perf_pmu_info {
>> const char *unit;
>> @@ -124,4 +128,7 @@ int perf_pmu__convert_scale(const char *scale, char **end, double *sval);
>>
>> int perf_pmu__caps_parse(struct perf_pmu *pmu);
>>
>> +#define perf_pmu__for_each_hybrid_pmus(pmu) \
>
> singular, i.e.
>
> #define perf_pmu__for_each_hybrid_pmu(pmu) \
>

Got it. Will use perf_pmu__for_each_hybrid_pmu in the next version.
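
For clarity, the rename only changes the macro name; the definition stays
the same as the one quoted below:

#define perf_pmu__for_each_hybrid_pmu(pmu) \
	list_for_each_entry(pmu, &perf_pmu__hybrid_pmus, hybrid_list)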

Thanks
Jin Yao

>> + list_for_each_entry(pmu, &perf_pmu__hybrid_pmus, hybrid_list)
>> +
>> #endif /* __PMU_H */
>> --
>> 2.7.4
>>
>

2021-02-09 00:21:46

by Jin Yao

[permalink] [raw]
Subject: Re: [PATCH 27/49] perf util: Save pmu name to struct perf_pmu_alias

Hi Arnaldo,

On 2/9/2021 2:57 AM, Arnaldo Carvalho de Melo wrote:
> On Mon, Feb 08, 2021 at 07:25:24AM -0800, [email protected] wrote:
>> From: Jin Yao <[email protected]>
>>
>> On a hybrid platform, one event is available on one pmu
>> (such as cpu_core or cpu_atom).
>>
>> This patch saves the pmu name to the pmu field of struct perf_pmu_alias.
>> Then we can know on which pmu the event can be enabled.
>>
>> Reviewed-by: Andi Kleen <[email protected]>
>> Signed-off-by: Jin Yao <[email protected]>
>> ---
>> tools/perf/util/pmu.c | 17 +++++++++++++----
>> tools/perf/util/pmu.h | 1 +
>> 2 files changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
>> index 44ef283..0c25457 100644
>> --- a/tools/perf/util/pmu.c
>> +++ b/tools/perf/util/pmu.c
>> @@ -283,6 +283,7 @@ void perf_pmu_free_alias(struct perf_pmu_alias *newalias)
>> zfree(&newalias->str);
>> zfree(&newalias->metric_expr);
>> zfree(&newalias->metric_name);
>> + zfree(&newalias->pmu);
>> parse_events_terms__purge(&newalias->terms);
>> free(newalias);
>> }
>> @@ -297,6 +298,10 @@ static bool perf_pmu_merge_alias(struct perf_pmu_alias *newalias,
>>
>> list_for_each_entry(a, alist, list) {
>> if (!strcasecmp(newalias->name, a->name)) {
>> + if (newalias->pmu && a->pmu &&
>> + !strcasecmp(newalias->pmu, a->pmu)) {
>> + continue;
>> + }
>> perf_pmu_update_alias(a, newalias);
>> perf_pmu_free_alias(newalias);
>> return true;
>> @@ -311,7 +316,8 @@ static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
>> char *unit, char *perpkg,
>> char *metric_expr,
>> char *metric_name,
>> - char *deprecated)
>> + char *deprecated,
>> + char *pmu)
>> {
>> struct parse_events_term *term;
>> struct perf_pmu_alias *alias;
>> @@ -382,6 +388,7 @@ static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
>> }
>> alias->per_pkg = perpkg && sscanf(perpkg, "%d", &num) == 1 && num == 1;
>> alias->str = strdup(newval);
>> + alias->pmu = pmu ? strdup(pmu) : NULL;
>>
>> if (deprecated)
>> alias->deprecated = true;
>> @@ -407,7 +414,7 @@ static int perf_pmu__new_alias(struct list_head *list, char *dir, char *name, FI
>> strim(buf);
>>
>> return __perf_pmu__new_alias(list, dir, name, NULL, buf, NULL, NULL, NULL,
>> - NULL, NULL, NULL, NULL);
>> + NULL, NULL, NULL, NULL, NULL);
>> }
>>
>> static inline bool pmu_alias_info_file(char *name)
>> @@ -797,7 +804,8 @@ void pmu_add_cpu_aliases_map(struct list_head *head, struct perf_pmu *pmu,
>> (char *)pe->unit, (char *)pe->perpkg,
>> (char *)pe->metric_expr,
>> (char *)pe->metric_name,
>> - (char *)pe->deprecated);
>> + (char *)pe->deprecated,
>> + (char *)pe->pmu);
>> }
>> }
>>
>> @@ -870,7 +878,8 @@ static int pmu_add_sys_aliases_iter_fn(struct pmu_event *pe, void *data)
>> (char *)pe->perpkg,
>> (char *)pe->metric_expr,
>> (char *)pe->metric_name,
>> - (char *)pe->deprecated);
>> + (char *)pe->deprecated,
>> + NULL);
>
> At some point I think passing the whole 'struct pmu_event' pointer
> would be better?
>

Yes, I'm thinking of changes along the following lines.

Before:

static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
                                 char *desc, char *val,
                                 char *long_desc, char *topic,
                                 char *unit, char *perpkg,
                                 char *metric_expr,
                                 char *metric_name,
                                 char *deprecated);

After:

static int __perf_pmu__new_alias(struct list_head *list, char *dir, char *name,
                                 char *desc, char *val,
                                 struct pmu_event *pe);

That looks much simpler than before.
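
As a rough sketch (assuming the pmu_event fields keep their current
names), a caller such as pmu_add_cpu_aliases_map() would then shrink to
something like:

	__perf_pmu__new_alias(head, NULL, (char *)pe->name,
			      (char *)pe->desc, (char *)pe->event, pe);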

Thanks
Jin Yao

>> }
>>
>> return 0;
>> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
>> index 8164388..0e724d5 100644
>> --- a/tools/perf/util/pmu.h
>> +++ b/tools/perf/util/pmu.h
>> @@ -72,6 +72,7 @@ struct perf_pmu_alias {
>> bool deprecated;
>> char *metric_expr;
>> char *metric_name;
>> + char *pmu;
>> };
>>
>> struct perf_pmu *perf_pmu__find(const char *name);
>> --
>> 2.7.4
>>
>

2021-02-09 00:27:17

by Jin Yao

[permalink] [raw]
Subject: Re: [PATCH 35/49] perf parse-events: Create two hybrid hardware events

Hi Arnaldo,

On 2/9/2021 2:59 AM, Arnaldo Carvalho de Melo wrote:
> On Mon, Feb 08, 2021 at 07:25:32AM -0800, [email protected] wrote:
>> From: Jin Yao <[email protected]>
>>
>> Hardware events have pre-defined configs. The kernel
>> needs to know where the event comes from (e.g. from cpu_core pmu
>> or from cpu_atom pmu). But the perf type 'PERF_TYPE_HARDWARE'
>> can't carry pmu information.
>>
>> So the kernel introduces a new type 'PERF_TYPE_HARDWARE_PMU'.
>> The new attr.config layout for PERF_TYPE_HARDWARE_PMU is:
>>
>> 0xDD000000AA
>> AA: original hardware event ID
>> DD: PMU type ID
>>
>> PMU type ID is retrieved from sysfs. For example,
>>
>> cat /sys/devices/cpu_atom/type
>> 10
>>
>> cat /sys/devices/cpu_core/type
>> 4
>>
>> When enabling a hybrid hardware event without a specified pmu, such as
>> 'perf stat -e cycles -a', two events are created automatically. One
>> is for atom, the other is for core.
>
> please move the command output two chars to the right, otherwise lines
> with --- may confuse some scripts.
>

Oh, very sorry about that. I will be careful in the next version.

Thanks
Jin Yao

>> root@otcpl-adl-s-2:~# ./perf stat -e cycles -vv -a -- sleep 1
>> Control descriptor is not initialized
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0x400000000
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
>> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
>> sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
>> sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
>> sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 8
>> sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 9
>> sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 10
>> sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 11
>> sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 12
>> sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 13
>> sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 14
>> sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 15
>> sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 16
>> sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 17
>> sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 18
>> sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0xa00000000
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
>> sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 21
>> sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 22
>> sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 23
>> sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 24
>> sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 25
>> sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 26
>> sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
>> cycles: 0: 1254337 1001292571 1001292571
>> cycles: 1: 2595141 1001279813 1001279813
>> cycles: 2: 134853 1001276406 1001276406
>> cycles: 3: 81119 1001271089 1001271089
>> cycles: 4: 251353 1001264678 1001264678
>> cycles: 5: 415593 1001259163 1001259163
>> cycles: 6: 129643 1001265312 1001265312
>> cycles: 7: 80289 1001258979 1001258979
>> cycles: 8: 169983 1001251207 1001251207
>> cycles: 9: 81981 1001245487 1001245487
>> cycles: 10: 4116221 1001245537 1001245537
>> cycles: 11: 85531 1001253097 1001253097
>> cycles: 12: 3969132 1001254270 1001254270
>> cycles: 13: 96006 1001254691 1001254691
>> cycles: 14: 385004 1001244971 1001244971
>> cycles: 15: 394446 1001251437 1001251437
>> cycles: 0: 427330 1001253457 1001253457
>> cycles: 1: 444043 1001255914 1001255914
>> cycles: 2: 97285 1001253555 1001253555
>> cycles: 3: 92071 1001260556 1001260556
>> cycles: 4: 86292 1001249896 1001249896
>> cycles: 5: 236851 1001238979 1001238979
>> cycles: 6: 100081 1001239792 1001239792
>> cycles: 7: 72836 1001243276 1001243276
>> cycles: 14240632 16020168708 16020168708
>> cycles: 1556789 8009995425 8009995425
>>
>> Performance counter stats for 'system wide':
>>
>> 14,240,632 cycles
>> 1,556,789 cycles
>>
>> 1.002261231 seconds time elapsed
>>
>> type 6 is PERF_TYPE_HARDWARE_PMU.
>> 0x4 in 0x400000000 indicates the cpu_core pmu.
>> 0xa in 0xa00000000 indicates the cpu_atom pmu.
>>
>> Reviewed-by: Andi Kleen <[email protected]>
>> Signed-off-by: Jin Yao <[email protected]>
>> ---
>> tools/perf/util/parse-events.c | 73 ++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 73 insertions(+)
>>
>> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
>> index 81a6fce..1e767dc 100644
>> --- a/tools/perf/util/parse-events.c
>> +++ b/tools/perf/util/parse-events.c
>> @@ -446,6 +446,24 @@ static int config_attr(struct perf_event_attr *attr,
>> struct parse_events_error *err,
>> config_term_func_t config_term);
>>
>> +static void config_hybrid_attr(struct perf_event_attr *attr,
>> + int type, int pmu_type)
>> +{
>> + /*
>> + * attr.config layout:
>> + * PERF_TYPE_HARDWARE_PMU: 0xDD000000AA
>> + * AA: hardware event ID
>> + * DD: PMU type ID
>> + * PERF_TYPE_HW_CACHE_PMU: 0xDD00CCBBAA
>> + * AA: hardware cache ID
>> + * BB: hardware cache op ID
>> + * CC: hardware cache op result ID
>> + * DD: PMU type ID
>> + */
>> + attr->type = type;
>> + attr->config = attr->config | ((__u64)pmu_type << PERF_PMU_TYPE_SHIFT);
>> +}
>> +
>> int parse_events_add_cache(struct list_head *list, int *idx,
>> char *type, char *op_result1, char *op_result2,
>> struct parse_events_error *err,
>> @@ -1409,6 +1427,47 @@ int parse_events_add_tracepoint(struct list_head *list, int *idx,
>> err, head_config);
>> }
>>
>> +static int create_hybrid_hw_event(struct parse_events_state *parse_state,
>> + struct list_head *list,
>> + struct perf_event_attr *attr,
>> + struct perf_pmu *pmu)
>> +{
>> + struct evsel *evsel;
>> + __u32 type = attr->type;
>> + __u64 config = attr->config;
>> +
>> + config_hybrid_attr(attr, PERF_TYPE_HARDWARE_PMU, pmu->type);
>> + evsel = __add_event(list, &parse_state->idx, attr, true, NULL,
>> + pmu, NULL, false, NULL);
>> + if (evsel)
>> + evsel->pmu_name = strdup(pmu->name);
>> + else
>> + return -ENOMEM;
>> +
>> + attr->type = type;
>> + attr->config = config;
>> + return 0;
>> +}
>> +
>> +static int add_hybrid_numeric(struct parse_events_state *parse_state,
>> + struct list_head *list,
>> + struct perf_event_attr *attr,
>> + bool *hybrid)
>> +{
>> + struct perf_pmu *pmu;
>> + int ret;
>> +
>> + *hybrid = false;
>> + perf_pmu__for_each_hybrid_pmus(pmu) {
>> + *hybrid = true;
>> + ret = create_hybrid_hw_event(parse_state, list, attr, pmu);
>> + if (ret)
>> + return ret;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> int parse_events_add_numeric(struct parse_events_state *parse_state,
>> struct list_head *list,
>> u32 type, u64 config,
>> @@ -1416,6 +1475,8 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
>> {
>> struct perf_event_attr attr;
>> LIST_HEAD(config_terms);
>> + bool hybrid;
>> + int ret;
>>
>> memset(&attr, 0, sizeof(attr));
>> attr.type = type;
>> @@ -1430,6 +1491,18 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
>> return -ENOMEM;
>> }
>>
>> + /*
>> + * Skip the software dummy event.
>> + */
>> + if (type != PERF_TYPE_SOFTWARE) {
>> + if (!perf_pmu__hybrid_exist())
>> + perf_pmu__scan(NULL);
>> +
>> + ret = add_hybrid_numeric(parse_state, list, &attr, &hybrid);
>> + if (hybrid)
>> + return ret;
>> + }
>> +
>> return add_event(list, &parse_state->idx, &attr,
>> get_config_name(head_config), &config_terms);
>> }
>> --
>> 2.7.4
>>
>

2021-02-09 00:31:15

by Jin Yao

[permalink] [raw]
Subject: Re: [PATCH 39/49] perf parse-events: Support hybrid raw events

Hi Arnaldo,

On 2/9/2021 3:07 AM, Arnaldo Carvalho de Melo wrote:
> On Mon, Feb 08, 2021 at 07:25:36AM -0800, [email protected] wrote:
>> From: Jin Yao <[email protected]>
>>
>> On a hybrid platform, the same raw event may be available on both the
>> cpu_core pmu and the cpu_atom pmu. So creating two raw events for one
>> event encoding is supported.
>>
>> root@otcpl-adl-s-2:~# ./perf stat -e r3c -a -vv -- sleep 1
>> Control descriptor is not initialized
>> ------------------------------------------------------------
>
> please move the command output two chars to the right
>

OK, I will make sure the command output is moved two chars to the right in the next version.

Thanks
Jin Yao

>> perf_event_attr:
>> type 4
>> size 120
>> config 0x3c
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
>> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
>> sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
>> sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
>> sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 8
>> sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 9
>> sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 10
>> sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 11
>> sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 12
>> sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 13
>> sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 14
>> sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 15
>> sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 16
>> sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 17
>> sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 18
>> sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 10
>> size 120
>> config 0x3c
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
>> sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 21
>> sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 22
>> sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 23
>> sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 24
>> sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 25
>> sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 26
>> sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
>> ...
>>
>> Performance counter stats for 'system wide':
>>
>> 13,107,070 r3c
>> 316,562 r3c
>>
>> 1.002161379 seconds time elapsed
>>
>> It also supports specifying the raw event inside a pmu. The syntax is similar:
>>
>> cpu_core/<raw event>/
>> cpu_atom/<raw event>/
>>
>> root@otcpl-adl-s-2:~# ./perf stat -e cpu_core/r3c/ -vv -- ./triad_loop
>> Control descriptor is not initialized
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 4
>> size 120
>> config 0x3c
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 23641 cpu -1 group_fd -1 flags 0x8 = 3
>> cpu_core/r3c/: 0: 401407363 102724005 102724005
>> cpu_core/r3c/: 401407363 102724005 102724005
>>
>> Performance counter stats for './triad_loop':
>>
>> 401,407,363 cpu_core/r3c/
>>
>> 0.103186241 seconds time elapsed
>>
>> Reviewed-by: Andi Kleen <[email protected]>
>> Signed-off-by: Jin Yao <[email protected]>
>> ---
>> tools/perf/util/parse-events.c | 56 +++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 55 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
>> index ddf6f79..6d7a2ce 100644
>> --- a/tools/perf/util/parse-events.c
>> +++ b/tools/perf/util/parse-events.c
>> @@ -1532,6 +1532,55 @@ static int add_hybrid_numeric(struct parse_events_state *parse_state,
>> return 0;
>> }
>>
>> +static int create_hybrid_raw_event(struct parse_events_state *parse_state,
>> + struct list_head *list,
>> + struct perf_event_attr *attr,
>> + struct list_head *head_config,
>> + struct list_head *config_terms,
>> + struct perf_pmu *pmu)
>> +{
>> + struct evsel *evsel;
>> +
>> + attr->type = pmu->type;
>> + evsel = __add_event(list, &parse_state->idx, attr, true,
>> + get_config_name(head_config),
>> + pmu, config_terms, false, NULL);
>> + if (evsel)
>> + evsel->pmu_name = strdup(pmu->name);
>> + else
>> + return -ENOMEM;
>> +
>> + return 0;
>> +}
>> +
>> +static int add_hybrid_raw(struct parse_events_state *parse_state,
>> + struct list_head *list,
>> + struct perf_event_attr *attr,
>> + struct list_head *head_config,
>> + struct list_head *config_terms,
>> + bool *hybrid)
>> +{
>> + struct perf_pmu *pmu;
>> + int ret;
>> +
>> + *hybrid = false;
>> + perf_pmu__for_each_hybrid_pmus(pmu) {
>> + *hybrid = true;
>> + if (parse_state->pmu_name &&
>> + strcmp(parse_state->pmu_name, pmu->name)) {
>> + continue;
>> + }
>> +
>> + ret = create_hybrid_raw_event(parse_state, list, attr,
>> + head_config, config_terms,
>> + pmu);
>> + if (ret)
>> + return ret;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> int parse_events_add_numeric(struct parse_events_state *parse_state,
>> struct list_head *list,
>> u32 type, u64 config,
>> @@ -1558,7 +1607,12 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
>> /*
>> * Skip the software dummy event.
>> */
>> - if (type != PERF_TYPE_SOFTWARE) {
>> + if (type == PERF_TYPE_RAW) {
>> + ret = add_hybrid_raw(parse_state, list, &attr, head_config,
>> + &config_terms, &hybrid);
>> + if (hybrid)
>> + return ret;
>> + } else if (type != PERF_TYPE_SOFTWARE) {
>> if (!perf_pmu__hybrid_exist())
>> perf_pmu__scan(NULL);
>>
>> --
>> 2.7.4
>>
>

2021-02-09 00:31:33

by Jin Yao

[permalink] [raw]
Subject: Re: [PATCH 32/49] perf header: Support HYBRID_TOPOLOGY feature

Hi Arnaldo,

On 2/9/2021 3:05 AM, Arnaldo Carvalho de Melo wrote:
> On Mon, Feb 08, 2021 at 07:25:29AM -0800, [email protected] wrote:
>> From: Jin Yao <[email protected]>
>>
>> It would be useful to let the user know the hybrid topology.
>> For example, the HYBRID_TOPOLOGY feature in the header indicates which
>> cpus are core cpus, and which cpus are atom cpus.
>
> Can you please update tools/perf/Documentation/perf.data-file-format.txt
> ?
>

OK, I will update perf.data-file-format.txt in the next version.
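
In case it helps, here is a possible description for
perf.data-file-format.txt, based on write_hybrid_topology() below (the
wording is only a suggestion):

	HEADER_HYBRID_TOPOLOGY

	Indicates the hybrid CPU topology. The layout is a u32 count of
	hybrid PMUs, followed by that many records, each consisting of
	two strings: the PMU name (e.g. "cpu_core") and its cpu list
	(e.g. "0-15").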

>> With this patch,
>
>> On a hybrid platform:
>>
>> root@otcpl-adl-s-2:~# ./perf report --header-only -I
>> ...
>> # cpu_core cpu list : 0-15
>> # cpu_atom cpu list : 16-23
>>
>> On a non-hybrid platform:
>>
>> root@kbl-ppc:~# ./perf report --header-only -I
>> ...
>> # missing features: TRACING_DATA BRANCH_STACK GROUP_DESC AUXTRACE STAT CLOCKID DIR_FORMAT COMPRESSED CLOCK_DATA HYBRID_TOPOLOGY
>>
>> It just shows HYBRID_TOPOLOGY as a missing feature.
>>
>> Reviewed-by: Andi Kleen <[email protected]>
>> Signed-off-by: Jin Yao <[email protected]>
>> ---
>> tools/perf/util/cputopo.c | 80 +++++++++++++++++++++++++++++++++++++++++
>> tools/perf/util/cputopo.h | 13 +++++++
>> tools/perf/util/env.c | 6 ++++
>> tools/perf/util/env.h | 7 ++++
>> tools/perf/util/header.c | 92 +++++++++++++++++++++++++++++++++++++++++++++++
>> tools/perf/util/header.h | 1 +
>> tools/perf/util/pmu.c | 1 -
>> tools/perf/util/pmu.h | 1 +
>> 8 files changed, 200 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/cputopo.c b/tools/perf/util/cputopo.c
>> index 1b52402..4a00fb8 100644
>> --- a/tools/perf/util/cputopo.c
>> +++ b/tools/perf/util/cputopo.c
>> @@ -12,6 +12,7 @@
>> #include "cpumap.h"
>> #include "debug.h"
>> #include "env.h"
>> +#include "pmu.h"
>>
>> #define CORE_SIB_FMT \
>> "%s/devices/system/cpu/cpu%d/topology/core_siblings_list"
>> @@ -351,3 +352,82 @@ void numa_topology__delete(struct numa_topology *tp)
>>
>> free(tp);
>> }
>> +
>> +static int load_hybrid_node(struct hybrid_topology_node *node,
>> + struct perf_pmu *pmu)
>> +{
>> + const char *sysfs;
>> + char path[PATH_MAX];
>> + char *buf = NULL, *p;
>> + FILE *fp;
>> + size_t len = 0;
>> +
>> + node->pmu_name = strdup(pmu->name);
>> + if (!node->pmu_name)
>> + return -1;
>> +
>> + sysfs = sysfs__mountpoint();
>
> Check for NULL
>

Yes, my fault, I need to check for NULL.
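
Something along these lines in load_hybrid_node() should do it (buf is
still NULL at that point, so the existing err path remains safe):

	sysfs = sysfs__mountpoint();
	if (!sysfs)
		goto err;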

Thanks
Jin Yao

>> + snprintf(path, PATH_MAX, CPUS_TEMPLATE_CPU, sysfs, pmu->name);
>> +
>> + fp = fopen(path, "r");
>> + if (!fp)
>> + goto err;
>> +
>> + if (getline(&buf, &len, fp) <= 0) {
>> + fclose(fp);
>> + goto err;
>> + }
>> +
>> + p = strchr(buf, '\n');
>> + if (p)
>> + *p = '\0';
>> +
>> + fclose(fp);
>> + node->cpus = buf;
>> + return 0;
>> +
>> +err:
>> + zfree(&node->pmu_name);
>> + free(buf);
>> + return -1;
>> +}
>> +
>> +struct hybrid_topology *hybrid_topology__new(void)
>> +{
>> + struct perf_pmu *pmu;
>> + struct hybrid_topology *tp = NULL;
>> + u32 nr = 0, i = 0;
>> +
>> + perf_pmu__for_each_hybrid_pmus(pmu)
>> + nr++;
>> +
>> + if (nr == 0)
>> + return NULL;
>> +
>> + tp = zalloc(sizeof(*tp) + sizeof(tp->nodes[0]) * nr);
>> + if (!tp)
>> + return NULL;
>> +
>> + tp->nr = nr;
>> + perf_pmu__for_each_hybrid_pmus(pmu) {
>> + if (load_hybrid_node(&tp->nodes[i], pmu)) {
>> + hybrid_topology__delete(tp);
>> + return NULL;
>> + }
>> + i++;
>> + }
>> +
>> + return tp;
>> +}
>> +
>> +void hybrid_topology__delete(struct hybrid_topology *tp)
>> +{
>> + u32 i;
>> +
>> + for (i = 0; i < tp->nr; i++) {
>> + zfree(&tp->nodes[i].pmu_name);
>> + zfree(&tp->nodes[i].cpus);
>> + }
>> +
>> + free(tp);
>> +}
>> diff --git a/tools/perf/util/cputopo.h b/tools/perf/util/cputopo.h
>> index 6201c37..d9af971 100644
>> --- a/tools/perf/util/cputopo.h
>> +++ b/tools/perf/util/cputopo.h
>> @@ -25,10 +25,23 @@ struct numa_topology {
>> struct numa_topology_node nodes[];
>> };
>>
>> +struct hybrid_topology_node {
>> + char *pmu_name;
>> + char *cpus;
>> +};
>> +
>> +struct hybrid_topology {
>> + u32 nr;
>> + struct hybrid_topology_node nodes[];
>> +};
>> +
>> struct cpu_topology *cpu_topology__new(void);
>> void cpu_topology__delete(struct cpu_topology *tp);
>>
>> struct numa_topology *numa_topology__new(void);
>> void numa_topology__delete(struct numa_topology *tp);
>>
>> +struct hybrid_topology *hybrid_topology__new(void);
>> +void hybrid_topology__delete(struct hybrid_topology *tp);
>> +
>> #endif /* __PERF_CPUTOPO_H */
>> diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
>> index 9130f6f..9e05eca 100644
>> --- a/tools/perf/util/env.c
>> +++ b/tools/perf/util/env.c
>> @@ -202,6 +202,12 @@ void perf_env__exit(struct perf_env *env)
>> for (i = 0; i < env->nr_memory_nodes; i++)
>> zfree(&env->memory_nodes[i].set);
>> zfree(&env->memory_nodes);
>> +
>> + for (i = 0; i < env->nr_hybrid_nodes; i++) {
>> + perf_cpu_map__put(env->hybrid_nodes[i].map);
>> + zfree(&env->hybrid_nodes[i].pmu_name);
>> + }
>> + zfree(&env->hybrid_nodes);
>> }
>>
>> void perf_env__init(struct perf_env *env __maybe_unused)
>> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
>> index ca249bf..9ca7633 100644
>> --- a/tools/perf/util/env.h
>> +++ b/tools/perf/util/env.h
>> @@ -37,6 +37,11 @@ struct memory_node {
>> unsigned long *set;
>> };
>>
>> +struct hybrid_node {
>> + char *pmu_name;
>> + struct perf_cpu_map *map;
>> +};
>> +
>> struct perf_env {
>> char *hostname;
>> char *os_release;
>> @@ -59,6 +64,7 @@ struct perf_env {
>> int nr_pmu_mappings;
>> int nr_groups;
>> int nr_cpu_pmu_caps;
>> + int nr_hybrid_nodes;
>> char *cmdline;
>> const char **cmdline_argv;
>> char *sibling_cores;
>> @@ -77,6 +83,7 @@ struct perf_env {
>> struct numa_node *numa_nodes;
>> struct memory_node *memory_nodes;
>> unsigned long long memory_bsize;
>> + struct hybrid_node *hybrid_nodes;
>> #ifdef HAVE_LIBBPF_SUPPORT
>> /*
>> * bpf_info_lock protects bpf rbtrees. This is needed because the
>> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
>> index c4ed3dc..6bcd959 100644
>> --- a/tools/perf/util/header.c
>> +++ b/tools/perf/util/header.c
>> @@ -932,6 +932,40 @@ static int write_clock_data(struct feat_fd *ff,
>> return do_write(ff, data64, sizeof(*data64));
>> }
>>
>> +static int write_hybrid_topology(struct feat_fd *ff,
>> + struct evlist *evlist __maybe_unused)
>> +{
>> + struct hybrid_topology *tp;
>> + int ret;
>> + u32 i;
>> +
>> + tp = hybrid_topology__new();
>> + if (!tp)
>> + return -1;
>> +
>> + ret = do_write(ff, &tp->nr, sizeof(u32));
>> + if (ret < 0)
>> + goto err;
>> +
>> + for (i = 0; i < tp->nr; i++) {
>> + struct hybrid_topology_node *n = &tp->nodes[i];
>> +
>> + ret = do_write_string(ff, n->pmu_name);
>> + if (ret < 0)
>> + goto err;
>> +
>> + ret = do_write_string(ff, n->cpus);
>> + if (ret < 0)
>> + goto err;
>> + }
>> +
>> + ret = 0;
>> +
>> +err:
>> + hybrid_topology__delete(tp);
>> + return ret;
>> +}
>> +
>> static int write_dir_format(struct feat_fd *ff,
>> struct evlist *evlist __maybe_unused)
>> {
>> @@ -1623,6 +1657,19 @@ static void print_clock_data(struct feat_fd *ff, FILE *fp)
>> clockid_name(clockid));
>> }
>>
>> +static void print_hybrid_topology(struct feat_fd *ff, FILE *fp)
>> +{
>> + int i;
>> + struct hybrid_node *n;
>> +
>> + for (i = 0; i < ff->ph->env.nr_hybrid_nodes; i++) {
>> + n = &ff->ph->env.hybrid_nodes[i];
>> +
>> + fprintf(fp, "# %s cpu list : ", n->pmu_name);
>> + cpu_map__fprintf(n->map, fp);
>> + }
>> +}
>> +
>> static void print_dir_format(struct feat_fd *ff, FILE *fp)
>> {
>> struct perf_session *session;
>> @@ -2849,6 +2896,50 @@ static int process_clock_data(struct feat_fd *ff,
>> return 0;
>> }
>>
>> +static int process_hybrid_topology(struct feat_fd *ff,
>> + void *data __maybe_unused)
>> +{
>> + struct hybrid_node *nodes, *n;
>> + u32 nr, i;
>> + char *str;
>> +
>> + /* nr nodes */
>> + if (do_read_u32(ff, &nr))
>> + return -1;
>> +
>> + nodes = zalloc(sizeof(*nodes) * nr);
>> + if (!nodes)
>> + return -ENOMEM;
>> +
>> + for (i = 0; i < nr; i++) {
>> + n = &nodes[i];
>> +
>> + n->pmu_name = do_read_string(ff);
>> + if (!n->pmu_name)
>> + goto error;
>> +
>> + str = do_read_string(ff);
>> + if (!str)
>> + goto error;
>> +
>> + n->map = perf_cpu_map__new(str);
>> + free(str);
>> + if (!n->map)
>> + goto error;
>> + }
>> +
>> + ff->ph->env.nr_hybrid_nodes = nr;
>> + ff->ph->env.hybrid_nodes = nodes;
>> + return 0;
>> +
>> +error:
>> + for (i = 0; i < nr; i++)
>> + free(nodes[i].pmu_name);
>> +
>> + free(nodes);
>> + return -1;
>> +}
>> +
>> static int process_dir_format(struct feat_fd *ff,
>> void *_data __maybe_unused)
>> {
>> @@ -3117,6 +3208,7 @@ const struct perf_header_feature_ops feat_ops[HEADER_LAST_FEATURE] = {
>> FEAT_OPR(COMPRESSED, compressed, false),
>> FEAT_OPR(CPU_PMU_CAPS, cpu_pmu_caps, false),
>> FEAT_OPR(CLOCK_DATA, clock_data, false),
>> + FEAT_OPN(HYBRID_TOPOLOGY, hybrid_topology, true),
>> };
>>
>> struct header_print_data {
>> diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
>> index 2aca717..3f12ec0 100644
>> --- a/tools/perf/util/header.h
>> +++ b/tools/perf/util/header.h
>> @@ -45,6 +45,7 @@ enum {
>> HEADER_COMPRESSED,
>> HEADER_CPU_PMU_CAPS,
>> HEADER_CLOCK_DATA,
>> + HEADER_HYBRID_TOPOLOGY,
>> HEADER_LAST_FEATURE,
>> HEADER_FEAT_BITS = 256,
>> };
>> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
>> index 9a6c973..ca2fc67 100644
>> --- a/tools/perf/util/pmu.c
>> +++ b/tools/perf/util/pmu.c
>> @@ -607,7 +607,6 @@ static struct perf_cpu_map *__pmu_cpumask(const char *path)
>> */
>> #define SYS_TEMPLATE_ID "./bus/event_source/devices/%s/identifier"
>> #define CPUS_TEMPLATE_UNCORE "%s/bus/event_source/devices/%s/cpumask"
>> -#define CPUS_TEMPLATE_CPU "%s/bus/event_source/devices/%s/cpus"
>>
>> static struct perf_cpu_map *pmu_cpumask(const char *name)
>> {
>> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
>> index 5b727cf..ccffc05 100644
>> --- a/tools/perf/util/pmu.h
>> +++ b/tools/perf/util/pmu.h
>> @@ -20,6 +20,7 @@ enum {
>>
>> #define PERF_PMU_FORMAT_BITS 64
>> #define EVENT_SOURCE_DEVICE_PATH "/bus/event_source/devices/"
>> +#define CPUS_TEMPLATE_CPU "%s/bus/event_source/devices/%s/cpus"
>>
>> struct perf_event_attr;
>>
>> --
>> 2.7.4
>>
>

2021-02-09 00:39:29

by Jin Yao

[permalink] [raw]
Subject: Re: [PATCH 43/49] perf stat: Add default hybrid events

Hi Arnaldo,

On 2/9/2021 3:10 AM, Arnaldo Carvalho de Melo wrote:
> On Mon, Feb 08, 2021 at 07:25:40AM -0800, [email protected] wrote:
>> From: Jin Yao <[email protected]>
>>
>> Previously, if '-e' is not specified in perf stat, some software events
>> and hardware events are added to the evlist by default.
>>
>> root@otcpl-adl-s-2:~# ./perf stat -- ./triad_loop
>>
>> Performance counter stats for './triad_loop':
>>
>> 109.43 msec task-clock # 0.993 CPUs utilized
>> 1 context-switches # 0.009 K/sec
>> 0 cpu-migrations # 0.000 K/sec
>> 105 page-faults # 0.960 K/sec
>> 401,161,982 cycles # 3.666 GHz
>> 1,601,216,357 instructions # 3.99 insn per cycle
>> 200,217,751 branches # 1829.686 M/sec
>> 14,555 branch-misses # 0.01% of all branches
>>
>> 0.110176860 seconds time elapsed
>>
>> Among the events, cycles, instructions, branches and branch-misses
>> are hardware events.
>>
>> On a hybrid platform, two events are created for one hardware event.
>>
>> core cycles,
>> atom cycles,
>> core instructions,
>> atom instructions,
>> core branches,
>> atom branches,
>> core branch-misses,
>> atom branch-misses
>>
>> These events will be added to the evlist in order on a hybrid platform
>> if '-e' is not set.
>>
>> Since parse_events() now supports creating two hardware events for
>> one event on a hybrid platform, we just use parse_events(evlist,
>> "cycles,instructions,branches,branch-misses") to create the default
>> events and add them to the evlist.
>>
>> After:
>> root@otcpl-adl-s-2:~# ./perf stat -vv -- taskset -c 16 ./triad_loop
>> ...
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 1
>> size 120
>> config 0x1
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 3
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 1
>> size 120
>> config 0x3
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 4
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 1
>> size 120
>> config 0x4
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 5
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 1
>> size 120
>> config 0x2
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 7
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0x400000000
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 8
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0xa00000000
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 9
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0x400000001
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 10
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0xa00000001
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 11
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0x400000004
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 12
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0xa00000004
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 13
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0x400000005
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid 27954 cpu -1 group_fd -1 flags 0x8 = 14
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0xa00000005
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
>> disabled 1
>> inherit 1
>> enable_on_exec 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> ...
>>
>> Performance counter stats for 'taskset -c 16 ./triad_loop':
>>
>> 201.31 msec task-clock # 0.997 CPUs utilized
>> 1 context-switches # 0.005 K/sec
>> 1 cpu-migrations # 0.005 K/sec
>> 166 page-faults # 0.825 K/sec
>> 623,267,134 cycles # 3096.043 M/sec (0.16%)
>> 603,082,383 cycles # 2995.777 M/sec (99.84%)
>> 406,410,481 instructions # 2018.820 M/sec (0.16%)
>> 1,604,213,375 instructions # 7968.837 M/sec (99.84%)
>> 81,444,171 branches # 404.569 M/sec (0.16%)
>> 200,616,430 branches # 996.550 M/sec (99.84%)
>> 3,769,856 branch-misses # 18.727 M/sec (0.16%)
>> 16,111 branch-misses # 0.080 M/sec (99.84%)
>>
>> 0.201895853 seconds time elapsed
>>
>> We can see two events are created for one hardware event.
>> The first one is the core event, the second one is the atom event.
>
> Can we have that (core/atom) as a prefix or in the comment area?
>

In next patch "perf stat: Uniquify hybrid event name", it would tell user the pmu which the event
belongs to.

For example, I run triad_loop on a core cpu:

root@ssp-pwrt-002:# ./perf stat -- taskset -c 0 ./triad_loop

Performance counter stats for 'taskset -c 0 ./triad_loop':

287.87 msec task-clock # 0.990 CPUs utilized
30 context-switches # 0.104 K/sec
1 cpu-migrations # 0.003 K/sec
168 page-faults # 0.584 K/sec
450,089,808 cycles [cpu_core] # 1563.496 M/sec
<not counted> cycles [cpu_atom] (0.00%)
1,602,536,074 instructions [cpu_core] # 5566.797 M/sec
<not counted> instructions [cpu_atom] (0.00%)
200,474,560 branches [cpu_core] # 696.397 M/sec
<not counted> branches [cpu_atom] (0.00%)
23,002 branch-misses [cpu_core] # 0.080 M/sec
<not counted> branch-misses [cpu_atom] (0.00%)

We can see cpu_atom is not counted.

Thanks
Jin Yao

>> One thing is that the shadow stats look a bit different now; it's just
>> 'M/sec'.
>>
>> The perf_stat__update_shadow_stats and perf_stat__print_shadow_stats
>> need to be improved in the future if we want to get the original
>> shadow stats.
>>
>> Reviewed-by: Andi Kleen <[email protected]>
>> Signed-off-by: Jin Yao <[email protected]>
>> ---
>> tools/perf/builtin-stat.c | 22 ++++++++++++++++++++++
>> 1 file changed, 22 insertions(+)
>>
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index 44d1a5f..0b08665 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -1137,6 +1137,13 @@ static int parse_hybrid_type(const struct option *opt,
>> return 0;
>> }
>>
>> +static int add_default_hybrid_events(struct evlist *evlist)
>> +{
>> + struct parse_events_error err;
>> +
>> + return parse_events(evlist, "cycles,instructions,branches,branch-misses", &err);
>> +}
>> +
>> static struct option stat_options[] = {
>> OPT_BOOLEAN('T', "transaction", &transaction_run,
>> "hardware transaction statistics"),
>> @@ -1613,6 +1620,12 @@ static int add_default_attributes(void)
>> { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },
>>
>> };
>> + struct perf_event_attr default_sw_attrs[] = {
>> + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
>> + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
>> + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
>> + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
>> +};
>>
>> /*
>> * Detailed stats (-d), covering the L1 and last level data caches:
>> @@ -1849,6 +1862,15 @@ static int add_default_attributes(void)
>> }
>>
>> if (!evsel_list->core.nr_entries) {
>> + perf_pmu__scan(NULL);
>> + if (perf_pmu__hybrid_exist()) {
>> + if (evlist__add_default_attrs(evsel_list,
>> + default_sw_attrs) < 0) {
>> + return -1;
>> + }
>> + return add_default_hybrid_events(evsel_list);
>> + }
>> +
>> if (target__has_cpu(&target))
>> default_attrs0[0].config = PERF_COUNT_SW_CPU_CLOCK;
>>
>> --
>> 2.7.4
>>
>

2021-02-09 00:51:06

by Jin Yao

[permalink] [raw]
Subject: Re: [PATCH 49/49] perf evsel: Adjust hybrid event and global event mixed group

Hi Arnaldo,

On 2/9/2021 3:12 AM, Arnaldo Carvalho de Melo wrote:
> On Mon, Feb 08, 2021 at 07:25:46AM -0800, [email protected] wrote:
>> From: Jin Yao <[email protected]>
>>
>> A group mixing a hybrid event and a global event is allowed. For example,
>> the group leader is 'cpu-clock' and the group member is 'cpu_atom/cycles/'.
>>
>> e.g.
>> perf stat -e '{cpu-clock,cpu_atom/cycles/}' -a
>>
>> The challenge is that their available cpus do not fully match.
>> For example, 'cpu-clock' is available on CPU0-CPU23, but 'cpu_atom/cycles/'
>> is available on CPU16-CPU23.
>>
>> When getting the group id for group member, we must be very careful
>> because the cpu for 'cpu-clock' is not equal to the cpu for 'cpu_atom/cycles/'.
>> Actually the cpu here is the index of evsel->core.cpus, not the real CPU ID.
>> e.g. cpu0 for 'cpu-clock' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.
>>
>> Another challenge is group read. The events in a group may not be
>> available on all cpus. For example, the leader is a software event and
>> it's available on CPU0-CPU1, but the group member is a hybrid event and
>> it's only available on CPU1. For CPU0, we have only one event, but for CPU1
>> we have two events. So we need to change the read size according to
>> the real number of events on that cpu.
>>
>> Let's see examples,
>>
>> root@otcpl-adl-s-2:~# ./perf stat -e '{cpu-clock,cpu_atom/cycles/}' -a -vvv -- sleep 1
>> Control descriptor is not initialized
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 1
>> size 120
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
>> disabled 1
>> inherit 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
>> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
>> sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
>> sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
>> sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 8
>> sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 9
>> sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 10
>> sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 11
>> sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 12
>> sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 13
>> sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 14
>> sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 15
>> sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 16
>> sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 17
>> sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 18
>> sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19
>> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20
>> sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 21
>> sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 22
>> sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 23
>> sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 24
>> sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 25
>> sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 26
>> sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0xa00000000
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
>> inherit 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1 cpu 16 group_fd 20 flags 0x8 = 28
>> sys_perf_event_open: pid -1 cpu 17 group_fd 21 flags 0x8 = 29
>> sys_perf_event_open: pid -1 cpu 18 group_fd 22 flags 0x8 = 30
>> sys_perf_event_open: pid -1 cpu 19 group_fd 23 flags 0x8 = 31
>> sys_perf_event_open: pid -1 cpu 20 group_fd 24 flags 0x8 = 32
>> sys_perf_event_open: pid -1 cpu 21 group_fd 25 flags 0x8 = 33
>> sys_perf_event_open: pid -1 cpu 22 group_fd 26 flags 0x8 = 34
>> sys_perf_event_open: pid -1 cpu 23 group_fd 27 flags 0x8 = 35
>> cpu-clock: 0: 1001661765 1001663044 1001663044
>> cpu-clock: 1: 1001659407 1001659885 1001659885
>> cpu-clock: 2: 1001646087 1001647302 1001647302
>> cpu-clock: 3: 1001645168 1001645550 1001645550
>> cpu-clock: 4: 1001645052 1001646102 1001646102
>> cpu-clock: 5: 1001643719 1001644472 1001644472
>> cpu-clock: 6: 1001641893 1001642859 1001642859
>> cpu-clock: 7: 1001640524 1001641036 1001641036
>> cpu-clock: 8: 1001637596 1001638076 1001638076
>> cpu-clock: 9: 1001638121 1001638200 1001638200
>> cpu-clock: 10: 1001635825 1001636915 1001636915
>> cpu-clock: 11: 1001633722 1001634276 1001634276
>> cpu-clock: 12: 1001687133 1001686941 1001686941
>> cpu-clock: 13: 1001693663 1001693317 1001693317
>> cpu-clock: 14: 1001693381 1001694407 1001694407
>> cpu-clock: 15: 1001691865 1001692321 1001692321
>> cpu-clock: 16: 1001696621 1001696550 1001696550
>> cpu-clock: 17: 1001699963 1001699822 1001699822
>> cpu-clock: 18: 1001701938 1001701850 1001701850
>> cpu-clock: 19: 1001699298 1001699214 1001699214
>> cpu-clock: 20: 1001691550 1001691026 1001691026
>> cpu-clock: 21: 1001688348 1001688212 1001688212
>> cpu-clock: 22: 1001684907 1001684799 1001684799
>> cpu-clock: 23: 1001680840 1001680780 1001680780
>> cycles: 0: 28175 1001696550 1001696550
>> cycles: 1: 403323 1001699822 1001699822
>> cycles: 2: 35905 1001701850 1001701850
>> cycles: 3: 36755 1001699214 1001699214
>> cycles: 4: 33757 1001691026 1001691026
>> cycles: 5: 37146 1001688212 1001688212
>> cycles: 6: 35483 1001684799 1001684799
>> cycles: 7: 38600 1001680780 1001680780
>> cpu-clock: 24040038386 24040046956 24040046956
>> cycles: 649144 8013542253 8013542253
>>
>> Performance counter stats for 'system wide':
>>
>> 24,040.04 msec cpu-clock # 23.976 CPUs utilized
>> 649,144 cycles [cpu_atom] # 0.027 M/sec
>>
>> 1.002683706 seconds time elapsed
>>
>> For cpu_atom/cycles/, cpu16-cpu23 are set with a valid group fd (cpu-clock's
>> fd on that cpu). For the counting results, cpu-clock aggregates over 24 cpus
>> and cpu_atom/cycles/ aggregates over 8 cpus. That's expected.
>>
>> But if the event order is changed, e.g. '{cpu_atom/cycles/,cpu-clock}',
>> there is more work to do.
>>
>> root@otcpl-adl-s-2:~# ./perf stat -e '{cpu_atom/cycles/,cpu-clock}' -a -vvv -- sleep 1
>> Control descriptor is not initialized
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 6
>> size 120
>> config 0xa00000000
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
>> disabled 1
>> inherit 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 3
>> sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 4
>> sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 5
>> sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 7
>> sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 8
>> sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 9
>> sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 10
>> sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 11
>> ------------------------------------------------------------
>> perf_event_attr:
>> type 1
>> size 120
>> sample_type IDENTIFIER
>> read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
>> inherit 1
>> exclude_guest 1
>> ------------------------------------------------------------
>> sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 12
>> sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 13
>> sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 14
>> sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 15
>> sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 16
>> sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 17
>> sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 18
>> sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 19
>> sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 20
>> sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 21
>> sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 22
>> sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 23
>> sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 24
>> sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 25
>> sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 26
>> sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 27
>> sys_perf_event_open: pid -1 cpu 16 group_fd 3 flags 0x8 = 28
>> sys_perf_event_open: pid -1 cpu 17 group_fd 4 flags 0x8 = 29
>> sys_perf_event_open: pid -1 cpu 18 group_fd 5 flags 0x8 = 30
>> sys_perf_event_open: pid -1 cpu 19 group_fd 7 flags 0x8 = 31
>> sys_perf_event_open: pid -1 cpu 20 group_fd 8 flags 0x8 = 32
>> sys_perf_event_open: pid -1 cpu 21 group_fd 9 flags 0x8 = 33
>> sys_perf_event_open: pid -1 cpu 22 group_fd 10 flags 0x8 = 34
>> sys_perf_event_open: pid -1 cpu 23 group_fd 11 flags 0x8 = 35
>> cycles: 0: 422260 1001993637 1001993637
>> cycles: 1: 631309 1002039934 1002039934
>> cycles: 2: 309501 1002018065 1002018065
>> cycles: 3: 119279 1002040811 1002040811
>> cycles: 4: 89389 1002039312 1002039312
>> cycles: 5: 155437 1002054794 1002054794
>> cycles: 6: 92420 1002051141 1002051141
>> cycles: 7: 96017 1002073659 1002073659
>> cpu-clock: 0: 0 0 0
>> cpu-clock: 1: 0 0 0
>> cpu-clock: 2: 0 0 0
>> cpu-clock: 3: 0 0 0
>> cpu-clock: 4: 0 0 0
>> cpu-clock: 5: 0 0 0
>> cpu-clock: 6: 0 0 0
>> cpu-clock: 7: 0 0 0
>> cpu-clock: 8: 0 0 0
>> cpu-clock: 9: 0 0 0
>> cpu-clock: 10: 0 0 0
>> cpu-clock: 11: 0 0 0
>> cpu-clock: 12: 0 0 0
>> cpu-clock: 13: 0 0 0
>> cpu-clock: 14: 0 0 0
>> cpu-clock: 15: 0 0 0
>> cpu-clock: 16: 1001997706 1001993637 1001993637
>> cpu-clock: 17: 1002040524 1002039934 1002039934
>> cpu-clock: 18: 1002018570 1002018065 1002018065
>> cpu-clock: 19: 1002041360 1002040811 1002040811
>> cpu-clock: 20: 1002044731 1002039312 1002039312
>> cpu-clock: 21: 1002055355 1002054794 1002054794
>> cpu-clock: 22: 1002051659 1002051141 1002051141
>> cpu-clock: 23: 1002074150 1002073659 1002073659
>> cycles: 1915612 8016311353 8016311353
>> cpu-clock: 8016324055 8016311353 8016311353
>>
>> Performance counter stats for 'system wide':
>>
>> 1,915,612 cycles [cpu_atom] # 0.239 M/sec
>
> I suggested having something like this in a previous patch, when
> creating two 'instructions', etc. events, one for cpu_atom and the other
> for cpu_core, perhaps even using the PMU style, i.e.
>
> 1,915,612 cpu_atom/cycles/ # 0.239 M/sec
>

OK, I will move this function to the previous patch.

For "cycles [cpu_atom]" style, we don't need more code, just set 'stat_config.no_merge = true'.

For "cpu_atom/cycles/" style, please let me think about it.

Thanks
Jin Yao

>> 8,016.32 msec cpu-clock # 7.996 CPUs utilized
>>
>> 1.002545027 seconds time elapsed
>>
>> For cpu-clock, cpu16-cpu23 are set with a valid group fd (cpu_atom/cycles/'s
>> fd on that cpu). For the counting results, cpu_atom/cycles/ aggregates over
>> 8 cpus, which is correct. But cpu-clock also aggregates over only 8 cpus
>> (cpu16-cpu23, not all cpus); the code should be improved. For now, a warning
>> is displayed: "WARNING: for cpu-clock, some CPU counts not read".
>>
>> Reviewed-by: Andi Kleen <[email protected]>
>> Signed-off-by: Jin Yao <[email protected]>
>> ---
>> tools/perf/util/evsel.c | 105 +++++++++++++++++++++++++++++++++++++++++++++---
>> tools/perf/util/stat.h | 1 +
>> 2 files changed, 101 insertions(+), 5 deletions(-)
>>
>> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
>> index 61508cf..65c8cfc8 100644
>> --- a/tools/perf/util/evsel.c
>> +++ b/tools/perf/util/evsel.c
>> @@ -1453,15 +1453,26 @@ static void evsel__set_count(struct evsel *counter, int cpu, int thread, u64 val
>> perf_counts__set_loaded(counter->counts, cpu, thread, true);
>> }
>>
>> -static int evsel__process_group_data(struct evsel *leader, int cpu, int thread, u64 *data)
>> +static int evsel_cpuid_match(struct evsel *evsel1, struct evsel *evsel2,
>> + int cpu)
>> +{
>> + int cpuid;
>> +
>> + cpuid = perf_cpu_map__cpu(evsel1->core.cpus, cpu);
>> + return perf_cpu_map__idx(evsel2->core.cpus, cpuid);
>> +}
>> +
>> +static int evsel__process_group_data(struct evsel *leader, int cpu, int thread,
>> + u64 *data, int nr_members)
>> {
>> u64 read_format = leader->core.attr.read_format;
>> struct sample_read_value *v;
>> u64 nr, ena = 0, run = 0, i;
>> + int idx;
>>
>> nr = *data++;
>>
>> - if (nr != (u64) leader->core.nr_members)
>> + if (nr != (u64) nr_members)
>> return -EINVAL;
>>
>> if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
>> @@ -1481,24 +1492,85 @@ static int evsel__process_group_data(struct evsel *leader, int cpu, int thread,
>> if (!counter)
>> return -EINVAL;
>>
>> - evsel__set_count(counter, cpu, thread, v[i].value, ena, run);
>> + if (evsel__is_hybrid_event(counter) ||
>> + evsel__is_hybrid_event(leader)) {
>> + idx = evsel_cpuid_match(leader, counter, cpu);
>> + if (idx == -1)
>> + return -EINVAL;
>> + } else
>> + idx = cpu;
>> +
>> + evsel__set_count(counter, idx, thread, v[i].value, ena, run);
>> }
>>
>> return 0;
>> }
>>
>> +static int hybrid_read_size(struct evsel *leader, int cpu, int *nr_members)
>> +{
>> + struct evsel *pos;
>> + int nr = 1, back, new_size = 0, idx;
>> +
>> + for_each_group_member(pos, leader) {
>> + idx = evsel_cpuid_match(leader, pos, cpu);
>> + if (idx != -1)
>> + nr++;
>> + }
>> +
>> + if (nr != leader->core.nr_members) {
>> + back = leader->core.nr_members;
>> + leader->core.nr_members = nr;
>> + new_size = perf_evsel__read_size(&leader->core);
>> + leader->core.nr_members = back;
>> + }
>> +
>> + *nr_members = nr;
>> + return new_size;
>> +}
>> +
>> static int evsel__read_group(struct evsel *leader, int cpu, int thread)
>> {
>> struct perf_stat_evsel *ps = leader->stats;
>> u64 read_format = leader->core.attr.read_format;
>> int size = perf_evsel__read_size(&leader->core);
>> + int new_size, nr_members;
>> u64 *data = ps->group_data;
>>
>> if (!(read_format & PERF_FORMAT_ID))
>> return -EINVAL;
>>
>> - if (!evsel__is_group_leader(leader))
>> + if (!evsel__is_group_leader(leader)) {
>> + if (evsel__is_hybrid_event(leader->leader) &&
>> + !evsel__is_hybrid_event(leader)) {
>> + /*
>> + * The group leader is hybrid event and it's
>> + * only available on part of cpus. But the group
>> + * member are available on all cpus. TODO:
>> + * read the counts on the rest of cpus for group
>> + * member.
>> + */
>> + WARN_ONCE(1, "WARNING: for %s, some CPU counts "
>> + "not read\n", leader->name);
>> + return 0;
>> + }
>> return -EINVAL;
>> + }
>> +
>> + /*
>> + * For example the leader is a software event and it's available on
>> + * cpu0-cpu1, but the group member is a hybrid event and it's only
>> + * available on cpu1. For cpu0, we have only one event, but for cpu1
>> + * we have two events. So we need to change the read size according to
>> + * the real number of events on a given cpu.
>> + */
>> + new_size = hybrid_read_size(leader, cpu, &nr_members);
>> + if (new_size)
>> + size = new_size;
>> +
>> + if (ps->group_data && ps->group_data_size < size) {
>> + zfree(&ps->group_data);
>> + data = NULL;
>> + }
>>
>> if (!data) {
>> data = zalloc(size);
>> @@ -1506,6 +1578,7 @@ static int evsel__read_group(struct evsel *leader, int cpu, int thread)
>> return -ENOMEM;
>>
>> ps->group_data = data;
>> + ps->group_data_size = size;
>> }
>>
>> if (FD(leader, cpu, thread) < 0)
>> @@ -1514,7 +1587,7 @@ static int evsel__read_group(struct evsel *leader, int cpu, int thread)
>> if (readn(FD(leader, cpu, thread), data, size) <= 0)
>> return -errno;
>>
>> - return evsel__process_group_data(leader, cpu, thread, data);
>> + return evsel__process_group_data(leader, cpu, thread, data, nr_members);
>> }
>>
>> int evsel__read_counter(struct evsel *evsel, int cpu, int thread)
>> @@ -1561,6 +1634,28 @@ static int get_group_fd(struct evsel *evsel, int cpu, int thread)
>> */
>> BUG_ON(!leader->core.fd);
>>
>> + /*
>> + * If leader is not hybrid event, it's available on
>> + * all cpus (e.g. software event). But hybrid evsel
>> + * member is only available on part of cpus. So need
>> + * to get the leader's fd from correct cpu.
>> + */
>> + if (evsel__is_hybrid_event(evsel) &&
>> + !evsel__is_hybrid_event(leader)) {
>> + cpu = evsel_cpuid_match(evsel, leader, cpu);
>> + BUG_ON(cpu == -1);
>> + }
>> +
>> + /*
>> + * Leader is hybrid event but member is global event.
>> + */
>> + if (!evsel__is_hybrid_event(evsel) &&
>> + evsel__is_hybrid_event(leader)) {
>> + cpu = evsel_cpuid_match(evsel, leader, cpu);
>> + if (cpu == -1)
>> + return -1;
>> + }
>> +
>> fd = FD(leader, cpu, thread);
>> BUG_ON(fd == -1);
>>
>> diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
>> index 80f6715..b96168c 100644
>> --- a/tools/perf/util/stat.h
>> +++ b/tools/perf/util/stat.h
>> @@ -46,6 +46,7 @@ struct perf_stat_evsel {
>> struct stats res_stats[3];
>> enum perf_stat_evsel_id id;
>> u64 *group_data;
>> + int group_data_size;
>> };
>>
>> enum aggr_mode {
>> --
>> 2.7.4
>>
>

2021-02-09 00:57:02

by Jin Yao

[permalink] [raw]
Subject: Re: [PATCH 46/49] perf stat: Filter out unmatched aggregation for hybrid event

Hi Arnaldo,

On 2/9/2021 3:16 AM, Arnaldo Carvalho de Melo wrote:
> Em Mon, Feb 08, 2021 at 07:25:43AM -0800, [email protected] escreveu:
>> From: Jin Yao <[email protected]>
>>
>> perf-stat supports several aggregation modes, such as --per-core,
>> --per-socket, etc. A hybrid event, however, may only be available
>> on part of the cpus. So for --per-core we need to filter out the
>> unavailable cores, for --per-socket the unavailable sockets, and
>> so on.
>>
>> Before:
>>
>> root@otcpl-adl-s-2:~# ./perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1
>>
>> Performance counter stats for 'system wide':
>>
>> S0-D0-C0 2 311,114 cycles [cpu_core]
>
> Why not use the pmu style event name, i.e.:
>
> S0-D0-C0 2 311,114 cpu_core/cycles/
>
> ?
>

For "cycles [cpu_core]" style, it's very easy. We just need to set 'stat_config.no_merge = true',
and no more coding work.

Please let me think about a easy way for pmu style event name.
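
One possible direction (just a sketch with made-up helper names, not a
final implementation) would be to compose the displayed name from the
hybrid PMU name and the event name, e.g.:

#include <stdio.h>

/*
 * Illustration only: build a "cpu_core/cycles/" style display name
 * from a PMU name and an event name.  Neither the helper nor its
 * caller is part of this series.
 */
static int pmu_style_name(char *buf, size_t sz,
			  const char *pmu, const char *event)
{
	return snprintf(buf, sz, "%s/%s/", pmu, event);
}

int main(void)
{
	char name[64];

	pmu_style_name(name, sizeof(name), "cpu_core", "cycles");
	/* mimics one --per-core output line */
	printf("S0-D0-C0                  2            311,114      %s\n", name);
	return 0;
}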

Thanks
Jin Yao

>> S0-D0-C4 2 59,784 cycles [cpu_core]
>> S0-D0-C8 2 121,287 cycles [cpu_core]
>> S0-D0-C12 2 2,690,245 cycles [cpu_core]
>> S0-D0-C16 2 2,060,545 cycles [cpu_core]
>> S0-D0-C20 2 3,632,251 cycles [cpu_core]
>> S0-D0-C24 2 775,736 cycles [cpu_core]
>> S0-D0-C28 2 742,020 cycles [cpu_core]
>> S0-D0-C32 0 <not counted> cycles [cpu_core]
>> S0-D0-C33 0 <not counted> cycles [cpu_core]
>> S0-D0-C34 0 <not counted> cycles [cpu_core]
>> S0-D0-C35 0 <not counted> cycles [cpu_core]
>> S0-D0-C36 0 <not counted> cycles [cpu_core]
>> S0-D0-C37 0 <not counted> cycles [cpu_core]
>> S0-D0-C38 0 <not counted> cycles [cpu_core]
>> S0-D0-C39 0 <not counted> cycles [cpu_core]
>>
>> 1.001779842 seconds time elapsed
>>
>> After:
>>
>> root@otcpl-adl-s-2:~# ./perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1
>>
>> Performance counter stats for 'system wide':
>>
>> S0-D0-C0 2 1,088,230 cycles [cpu_core]
>> S0-D0-C4 2 57,228 cycles [cpu_core]
>> S0-D0-C8 2 98,327 cycles [cpu_core]
>> S0-D0-C12 2 2,741,955 cycles [cpu_core]
>> S0-D0-C16 2 2,090,432 cycles [cpu_core]
>> S0-D0-C20 2 3,192,108 cycles [cpu_core]
>> S0-D0-C24 2 2,910,752 cycles [cpu_core]
>> S0-D0-C28 2 388,696 cycles [cpu_core]
>>
>> Reviewed-by: Andi Kleen <[email protected]>
>> Signed-off-by: Jin Yao <[email protected]>
>> ---
>> tools/perf/util/stat-display.c | 20 ++++++++++++++++++++
>> 1 file changed, 20 insertions(+)
>>
>> diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
>> index 21a3f80..fa11572 100644
>> --- a/tools/perf/util/stat-display.c
>> +++ b/tools/perf/util/stat-display.c
>> @@ -630,6 +630,20 @@ static void aggr_cb(struct perf_stat_config *config,
>> }
>> }
>>
>> +static bool aggr_id_hybrid_matched(struct perf_stat_config *config,
>> + struct evsel *counter, struct aggr_cpu_id id)
>> +{
>> + struct aggr_cpu_id s;
>> +
>> + for (int i = 0; i < evsel__nr_cpus(counter); i++) {
>> + s = config->aggr_get_id(config, evsel__cpus(counter), i);
>> + if (cpu_map__compare_aggr_cpu_id(s, id))
>> + return true;
>> + }
>> +
>> + return false;
>> +}
>> +
>> static void print_counter_aggrdata(struct perf_stat_config *config,
>> struct evsel *counter, int s,
>> char *prefix, bool metric_only,
>> @@ -643,6 +657,12 @@ static void print_counter_aggrdata(struct perf_stat_config *config,
>> double uval;
>>
>> ad.id = id = config->aggr_map->map[s];
>> +
>> + if (perf_pmu__hybrid_exist() &&
>> + !aggr_id_hybrid_matched(config, counter, id)) {
>> + return;
>> + }
>> +
>> ad.val = ad.ena = ad.run = 0;
>> ad.nr = 0;
>> if (!collect_data(config, counter, aggr_cb, &ad))
>> --
>> 2.7.4
>>
>

2021-02-09 04:34:27

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 23/49] perf/x86/msr: Add Alder Lake CPU support

Hi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/perf/core]
[cannot apply to tip/master linus/master tip/x86/core v5.11-rc6 next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/kan-liang-linux-intel-com/Add-Alder-Lake-support-for-perf/20210209-070642
base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 32451614da2a9cf4296f90d3606ac77814fb519d
config: x86_64-randconfig-s021-20210209 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.3-215-g0fb77bb6-dirty
# https://github.com/0day-ci/linux/commit/ef3d3e5028f5f70a78fa37d642e8e7e65c60dee7
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review kan-liang-linux-intel-com/Add-Alder-Lake-support-for-perf/20210209-070642
git checkout ef3d3e5028f5f70a78fa37d642e8e7e65c60dee7
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

arch/x86/events/msr.c: In function 'test_intel':
>> arch/x86/events/msr.c:104:7: error: 'INTEL_FAM6_ALDERLAKE_L' undeclared (first use in this function); did you mean 'INTEL_FAM6_ALDERLAKE'?
104 | case INTEL_FAM6_ALDERLAKE_L:
| ^~~~~~~~~~~~~~~~~~~~~~
| INTEL_FAM6_ALDERLAKE
arch/x86/events/msr.c:104:7: note: each undeclared identifier is reported only once for each function it appears in


vim +104 arch/x86/events/msr.c

39
40 static bool test_intel(int idx, void *data)
41 {
42 if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
43 boot_cpu_data.x86 != 6)
44 return false;
45
46 switch (boot_cpu_data.x86_model) {
47 case INTEL_FAM6_NEHALEM:
48 case INTEL_FAM6_NEHALEM_G:
49 case INTEL_FAM6_NEHALEM_EP:
50 case INTEL_FAM6_NEHALEM_EX:
51
52 case INTEL_FAM6_WESTMERE:
53 case INTEL_FAM6_WESTMERE_EP:
54 case INTEL_FAM6_WESTMERE_EX:
55
56 case INTEL_FAM6_SANDYBRIDGE:
57 case INTEL_FAM6_SANDYBRIDGE_X:
58
59 case INTEL_FAM6_IVYBRIDGE:
60 case INTEL_FAM6_IVYBRIDGE_X:
61
62 case INTEL_FAM6_HASWELL:
63 case INTEL_FAM6_HASWELL_X:
64 case INTEL_FAM6_HASWELL_L:
65 case INTEL_FAM6_HASWELL_G:
66
67 case INTEL_FAM6_BROADWELL:
68 case INTEL_FAM6_BROADWELL_D:
69 case INTEL_FAM6_BROADWELL_G:
70 case INTEL_FAM6_BROADWELL_X:
71
72 case INTEL_FAM6_ATOM_SILVERMONT:
73 case INTEL_FAM6_ATOM_SILVERMONT_D:
74 case INTEL_FAM6_ATOM_AIRMONT:
75
76 case INTEL_FAM6_ATOM_GOLDMONT:
77 case INTEL_FAM6_ATOM_GOLDMONT_D:
78 case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
79 case INTEL_FAM6_ATOM_TREMONT_D:
80 case INTEL_FAM6_ATOM_TREMONT:
81 case INTEL_FAM6_ATOM_TREMONT_L:
82
83 case INTEL_FAM6_XEON_PHI_KNL:
84 case INTEL_FAM6_XEON_PHI_KNM:
85 if (idx == PERF_MSR_SMI)
86 return true;
87 break;
88
89 case INTEL_FAM6_SKYLAKE_L:
90 case INTEL_FAM6_SKYLAKE:
91 case INTEL_FAM6_SKYLAKE_X:
92 case INTEL_FAM6_KABYLAKE_L:
93 case INTEL_FAM6_KABYLAKE:
94 case INTEL_FAM6_COMETLAKE_L:
95 case INTEL_FAM6_COMETLAKE:
96 case INTEL_FAM6_ICELAKE_L:
97 case INTEL_FAM6_ICELAKE:
98 case INTEL_FAM6_ICELAKE_X:
99 case INTEL_FAM6_ICELAKE_D:
100 case INTEL_FAM6_TIGERLAKE_L:
101 case INTEL_FAM6_TIGERLAKE:
102 case INTEL_FAM6_ROCKETLAKE:
103 case INTEL_FAM6_ALDERLAKE:
> 104 case INTEL_FAM6_ALDERLAKE_L:
105 if (idx == PERF_MSR_SMI || idx == PERF_MSR_PPERF)
106 return true;
107 break;
108 }
109
110 return false;
111 }
112

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (4.23 kB)
.config.gz (38.03 kB)

2021-02-09 04:40:33

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 22/49] perf/x86/intel/uncore: Add Alder Lake support

Hi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/perf/core]
[cannot apply to tip/master linus/master tip/x86/core v5.11-rc6 next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/kan-liang-linux-intel-com/Add-Alder-Lake-support-for-perf/20210209-070642
base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 32451614da2a9cf4296f90d3606ac77814fb519d
config: x86_64-randconfig-a014-20210209 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c9439ca36342fb6013187d0a69aef92736951476)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# https://github.com/0day-ci/linux/commit/23e3275ab58500ae0ec613f3b65b5c0465a4ac10
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review kan-liang-linux-intel-com/Add-Alder-Lake-support-for-perf/20210209-070642
git checkout 23e3275ab58500ae0ec613f3b65b5c0465a4ac10
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> arch/x86/events/intel/uncore.c:1682:2: error: use of undeclared identifier 'INTEL_FAM6_ALDERLAKE_L'
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &adl_uncore_init),
^
arch/x86/include/asm/cpu_device_id.h:161:39: note: expanded from macro 'X86_MATCH_INTEL_FAM6_MODEL'
X86_MATCH_VENDOR_FAM_MODEL(INTEL, 6, INTEL_FAM6_##model, data)
^
<scratch space>:100:1: note: expanded from here
INTEL_FAM6_ALDERLAKE_L
^
1 error generated.


vim +/INTEL_FAM6_ALDERLAKE_L +1682 arch/x86/events/intel/uncore.c

1644
1645 static const struct x86_cpu_id intel_uncore_match[] __initconst = {
1646 X86_MATCH_INTEL_FAM6_MODEL(NEHALEM_EP, &nhm_uncore_init),
1647 X86_MATCH_INTEL_FAM6_MODEL(NEHALEM, &nhm_uncore_init),
1648 X86_MATCH_INTEL_FAM6_MODEL(WESTMERE, &nhm_uncore_init),
1649 X86_MATCH_INTEL_FAM6_MODEL(WESTMERE_EP, &nhm_uncore_init),
1650 X86_MATCH_INTEL_FAM6_MODEL(SANDYBRIDGE, &snb_uncore_init),
1651 X86_MATCH_INTEL_FAM6_MODEL(IVYBRIDGE, &ivb_uncore_init),
1652 X86_MATCH_INTEL_FAM6_MODEL(HASWELL, &hsw_uncore_init),
1653 X86_MATCH_INTEL_FAM6_MODEL(HASWELL_L, &hsw_uncore_init),
1654 X86_MATCH_INTEL_FAM6_MODEL(HASWELL_G, &hsw_uncore_init),
1655 X86_MATCH_INTEL_FAM6_MODEL(BROADWELL, &bdw_uncore_init),
1656 X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_G, &bdw_uncore_init),
1657 X86_MATCH_INTEL_FAM6_MODEL(SANDYBRIDGE_X, &snbep_uncore_init),
1658 X86_MATCH_INTEL_FAM6_MODEL(NEHALEM_EX, &nhmex_uncore_init),
1659 X86_MATCH_INTEL_FAM6_MODEL(WESTMERE_EX, &nhmex_uncore_init),
1660 X86_MATCH_INTEL_FAM6_MODEL(IVYBRIDGE_X, &ivbep_uncore_init),
1661 X86_MATCH_INTEL_FAM6_MODEL(HASWELL_X, &hswep_uncore_init),
1662 X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_X, &bdx_uncore_init),
1663 X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_D, &bdx_uncore_init),
1664 X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL, &knl_uncore_init),
1665 X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM, &knl_uncore_init),
1666 X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE, &skl_uncore_init),
1667 X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_L, &skl_uncore_init),
1668 X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &skx_uncore_init),
1669 X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE_L, &skl_uncore_init),
1670 X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE, &skl_uncore_init),
1671 X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE_L, &skl_uncore_init),
1672 X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE, &skl_uncore_init),
1673 X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_L, &icl_uncore_init),
1674 X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_NNPI, &icl_uncore_init),
1675 X86_MATCH_INTEL_FAM6_MODEL(ICELAKE, &icl_uncore_init),
1676 X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &icx_uncore_init),
1677 X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &icx_uncore_init),
1678 X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE_L, &tgl_l_uncore_init),
1679 X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE, &tgl_uncore_init),
1680 X86_MATCH_INTEL_FAM6_MODEL(ROCKETLAKE, &rkl_uncore_init),
1681 X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &adl_uncore_init),
> 1682 X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &adl_uncore_init),
1683 X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_D, &snr_uncore_init),
1684 {},
1685 };
1686 MODULE_DEVICE_TABLE(x86cpu, intel_uncore_match);
1687

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (5.07 kB)
.config.gz (36.41 kB)

2021-02-09 05:17:54

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 23/49] perf/x86/msr: Add Alder Lake CPU support

Hi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/perf/core]
[cannot apply to tip/master linus/master tip/x86/core v5.11-rc6 next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/kan-liang-linux-intel-com/Add-Alder-Lake-support-for-perf/20210209-070642
base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 32451614da2a9cf4296f90d3606ac77814fb519d
config: x86_64-randconfig-a013-20210209 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c9439ca36342fb6013187d0a69aef92736951476)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# https://github.com/0day-ci/linux/commit/ef3d3e5028f5f70a78fa37d642e8e7e65c60dee7
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review kan-liang-linux-intel-com/Add-Alder-Lake-support-for-perf/20210209-070642
git checkout ef3d3e5028f5f70a78fa37d642e8e7e65c60dee7
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> arch/x86/events/msr.c:104:7: error: use of undeclared identifier 'INTEL_FAM6_ALDERLAKE_L'
case INTEL_FAM6_ALDERLAKE_L:
^
1 error generated.


vim +/INTEL_FAM6_ALDERLAKE_L +104 arch/x86/events/msr.c

39
40 static bool test_intel(int idx, void *data)
41 {
42 if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
43 boot_cpu_data.x86 != 6)
44 return false;
45
46 switch (boot_cpu_data.x86_model) {
47 case INTEL_FAM6_NEHALEM:
48 case INTEL_FAM6_NEHALEM_G:
49 case INTEL_FAM6_NEHALEM_EP:
50 case INTEL_FAM6_NEHALEM_EX:
51
52 case INTEL_FAM6_WESTMERE:
53 case INTEL_FAM6_WESTMERE_EP:
54 case INTEL_FAM6_WESTMERE_EX:
55
56 case INTEL_FAM6_SANDYBRIDGE:
57 case INTEL_FAM6_SANDYBRIDGE_X:
58
59 case INTEL_FAM6_IVYBRIDGE:
60 case INTEL_FAM6_IVYBRIDGE_X:
61
62 case INTEL_FAM6_HASWELL:
63 case INTEL_FAM6_HASWELL_X:
64 case INTEL_FAM6_HASWELL_L:
65 case INTEL_FAM6_HASWELL_G:
66
67 case INTEL_FAM6_BROADWELL:
68 case INTEL_FAM6_BROADWELL_D:
69 case INTEL_FAM6_BROADWELL_G:
70 case INTEL_FAM6_BROADWELL_X:
71
72 case INTEL_FAM6_ATOM_SILVERMONT:
73 case INTEL_FAM6_ATOM_SILVERMONT_D:
74 case INTEL_FAM6_ATOM_AIRMONT:
75
76 case INTEL_FAM6_ATOM_GOLDMONT:
77 case INTEL_FAM6_ATOM_GOLDMONT_D:
78 case INTEL_FAM6_ATOM_GOLDMONT_PLUS:
79 case INTEL_FAM6_ATOM_TREMONT_D:
80 case INTEL_FAM6_ATOM_TREMONT:
81 case INTEL_FAM6_ATOM_TREMONT_L:
82
83 case INTEL_FAM6_XEON_PHI_KNL:
84 case INTEL_FAM6_XEON_PHI_KNM:
85 if (idx == PERF_MSR_SMI)
86 return true;
87 break;
88
89 case INTEL_FAM6_SKYLAKE_L:
90 case INTEL_FAM6_SKYLAKE:
91 case INTEL_FAM6_SKYLAKE_X:
92 case INTEL_FAM6_KABYLAKE_L:
93 case INTEL_FAM6_KABYLAKE:
94 case INTEL_FAM6_COMETLAKE_L:
95 case INTEL_FAM6_COMETLAKE:
96 case INTEL_FAM6_ICELAKE_L:
97 case INTEL_FAM6_ICELAKE:
98 case INTEL_FAM6_ICELAKE_X:
99 case INTEL_FAM6_ICELAKE_D:
100 case INTEL_FAM6_TIGERLAKE_L:
101 case INTEL_FAM6_TIGERLAKE:
102 case INTEL_FAM6_ROCKETLAKE:
103 case INTEL_FAM6_ALDERLAKE:
> 104 case INTEL_FAM6_ALDERLAKE_L:
105 if (idx == PERF_MSR_SMI || idx == PERF_MSR_PPERF)
106 return true;
107 break;
108 }
109
110 return false;
111 }
112

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (4.26 kB)
.config.gz (36.10 kB)

2021-02-09 05:25:42

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 25/49] perf/x86/rapl: Add support for Intel Alder Lake

Hi,

I love your patch! Yet something to improve:

[auto build test ERROR on tip/perf/core]
[cannot apply to tip/master linus/master tip/x86/core v5.11-rc6 next-20210125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/kan-liang-linux-intel-com/Add-Alder-Lake-support-for-perf/20210209-070642
base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 32451614da2a9cf4296f90d3606ac77814fb519d
config: x86_64-randconfig-a014-20210209 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c9439ca36342fb6013187d0a69aef92736951476)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install x86_64 cross compiling tool for clang build
# apt-get install binutils-x86-64-linux-gnu
# https://github.com/0day-ci/linux/commit/f02aa47253758867fa7f74a286fde01ed042ac42
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review kan-liang-linux-intel-com/Add-Alder-Lake-support-for-perf/20210209-070642
git checkout f02aa47253758867fa7f74a286fde01ed042ac42
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> arch/x86/events/rapl.c:804:2: error: use of undeclared identifier 'INTEL_FAM6_ALDERLAKE_L'
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &model_skl),
^
arch/x86/include/asm/cpu_device_id.h:161:39: note: expanded from macro 'X86_MATCH_INTEL_FAM6_MODEL'
X86_MATCH_VENDOR_FAM_MODEL(INTEL, 6, INTEL_FAM6_##model, data)
^
<scratch space>:149:1: note: expanded from here
INTEL_FAM6_ALDERLAKE_L
^
1 error generated.


vim +/INTEL_FAM6_ALDERLAKE_L +804 arch/x86/events/rapl.c

772
773 static const struct x86_cpu_id rapl_model_match[] __initconst = {
774 X86_MATCH_INTEL_FAM6_MODEL(SANDYBRIDGE, &model_snb),
775 X86_MATCH_INTEL_FAM6_MODEL(SANDYBRIDGE_X, &model_snbep),
776 X86_MATCH_INTEL_FAM6_MODEL(IVYBRIDGE, &model_snb),
777 X86_MATCH_INTEL_FAM6_MODEL(IVYBRIDGE_X, &model_snbep),
778 X86_MATCH_INTEL_FAM6_MODEL(HASWELL, &model_hsw),
779 X86_MATCH_INTEL_FAM6_MODEL(HASWELL_X, &model_hsx),
780 X86_MATCH_INTEL_FAM6_MODEL(HASWELL_L, &model_hsw),
781 X86_MATCH_INTEL_FAM6_MODEL(HASWELL_G, &model_hsw),
782 X86_MATCH_INTEL_FAM6_MODEL(BROADWELL, &model_hsw),
783 X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_G, &model_hsw),
784 X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_X, &model_hsx),
785 X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_D, &model_hsx),
786 X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNL, &model_knl),
787 X86_MATCH_INTEL_FAM6_MODEL(XEON_PHI_KNM, &model_knl),
788 X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_L, &model_skl),
789 X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE, &model_skl),
790 X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &model_hsx),
791 X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE_L, &model_skl),
792 X86_MATCH_INTEL_FAM6_MODEL(KABYLAKE, &model_skl),
793 X86_MATCH_INTEL_FAM6_MODEL(CANNONLAKE_L, &model_skl),
794 X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT, &model_hsw),
795 X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT_D, &model_hsw),
796 X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT_PLUS, &model_hsw),
797 X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_L, &model_skl),
798 X86_MATCH_INTEL_FAM6_MODEL(ICELAKE, &model_skl),
799 X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &model_hsx),
800 X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &model_hsx),
801 X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE_L, &model_skl),
802 X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE, &model_skl),
803 X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &model_skl),
> 804 X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &model_skl),
805 X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &model_spr),
806 X86_MATCH_VENDOR_FAM(AMD, 0x17, &model_amd_fam17h),
807 X86_MATCH_VENDOR_FAM(HYGON, 0x18, &model_amd_fam17h),
808 X86_MATCH_VENDOR_FAM(AMD, 0x19, &model_amd_fam17h),
809 {},
810 };
811 MODULE_DEVICE_TABLE(x86cpu, rapl_model_match);
812

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (4.65 kB)
.config.gz (36.41 kB)

2021-02-09 13:50:16

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH 23/49] perf/x86/msr: Add Alder Lake CPU support



On 2/8/2021 10:58 PM, kernel test robot wrote:
> Hi,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on tip/perf/core]
> [cannot apply to tip/master linus/master tip/x86/core v5.11-rc6 next-20210125]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url: https://github.com/0day-ci/linux/commits/kan-liang-linux-intel-com/Add-Alder-Lake-support-for-perf/20210209-070642
> base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 32451614da2a9cf4296f90d3606ac77814fb519d
> config: x86_64-randconfig-s021-20210209 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> reproduce:
> # apt-get install sparse
> # sparse version: v0.6.3-215-g0fb77bb6-dirty
> # https://github.com/0day-ci/linux/commit/ef3d3e5028f5f70a78fa37d642e8e7e65c60dee7
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review kan-liang-linux-intel-com/Add-Alder-Lake-support-for-perf/20210209-070642
> git checkout ef3d3e5028f5f70a78fa37d642e8e7e65c60dee7
> # save the attached .config to linux build tree
> make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <[email protected]>
>
> All errors (new ones prefixed by >>):
>
> arch/x86/events/msr.c: In function 'test_intel':
>>> arch/x86/events/msr.c:104:7: error: 'INTEL_FAM6_ALDERLAKE_L' undeclared (first use in this function); did you mean 'INTEL_FAM6_ALDERLAKE'?
> 104 | case INTEL_FAM6_ALDERLAKE_L:
> | ^~~~~~~~~~~~~~~~~~~~~~
> | INTEL_FAM6_ALDERLAKE
> arch/x86/events/msr.c:104:7: note: each undeclared identifier is reported only once for each function it appears in


The patchset is on top of PeterZ's perf/core branch plus
commit 6e1239c13953 ("x86/cpu: Add another Alder Lake CPU to the
Intel family").

That commit is also missing from the tip/perf/core branch. All the
issues should be gone once tip/perf/core syncs with tip/x86/urgent.


Thanks,
Kan


2021-02-11 12:00:02

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH 00/49] Add Alder Lake support for perf

On Mon, Feb 08, 2021 at 07:24:57AM -0800, [email protected] wrote:

SNIP

> Jin Yao (24):
> perf jevents: Support unit value "cpu_core" and "cpu_atom"
> perf util: Save pmu name to struct perf_pmu_alias
> perf pmu: Save detected hybrid pmus to a global pmu list
> perf pmu: Add hybrid helper functions
> perf list: Support --cputype option to list hybrid pmu events
> perf stat: Hybrid evsel uses its own cpus
> perf header: Support HYBRID_TOPOLOGY feature
> perf header: Support hybrid CPU_PMU_CAPS
> tools headers uapi: Update tools's copy of linux/perf_event.h
> perf parse-events: Create two hybrid hardware events
> perf parse-events: Create two hybrid cache events
> perf parse-events: Support hardware events inside PMU
> perf list: Display pmu prefix for partially supported hybrid cache
> events
> perf parse-events: Support hybrid raw events
> perf stat: Support --cputype option for hybrid events
> perf stat: Support metrics with hybrid events
> perf evlist: Create two hybrid 'cycles' events by default
> perf stat: Add default hybrid events
> perf stat: Uniquify hybrid event name
> perf stat: Merge event counts from all hybrid PMUs
> perf stat: Filter out unmatched aggregation for hybrid event
> perf evlist: Warn as events from different hybrid PMUs in a group
> perf Documentation: Document intel-hybrid support
> perf evsel: Adjust hybrid event and global event mixed group
>
> Kan Liang (22):
> perf/x86/intel: Hybrid PMU support for perf capabilities
> perf/x86: Hybrid PMU support for intel_ctrl
> perf/x86: Hybrid PMU support for counters
> perf/x86: Hybrid PMU support for unconstrained
> perf/x86: Hybrid PMU support for hardware cache event
> perf/x86: Hybrid PMU support for event constraints
> perf/x86: Hybrid PMU support for extra_regs
> perf/x86/intel: Factor out intel_pmu_check_num_counters
> perf/x86/intel: Factor out intel_pmu_check_event_constraints
> perf/x86/intel: Factor out intel_pmu_check_extra_regs
> perf/x86: Expose check_hw_exists
> perf/x86: Remove temporary pmu assignment in event_init
> perf/x86: Factor out x86_pmu_show_pmu_cap
> perf/x86: Register hybrid PMUs
> perf/x86: Add structures for the attributes of Hybrid PMUs
> perf/x86/intel: Add attr_update for Hybrid PMUs
> perf/x86: Support filter_match callback
> perf/x86/intel: Add Alder Lake Hybrid support
> perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU
> perf/x86/intel/uncore: Add Alder Lake support
> perf/x86/msr: Add Alder Lake CPU support
> perf/x86/cstate: Add Alder Lake CPU support
>
> Ricardo Neri (2):
> x86/cpufeatures: Enumerate Intel Hybrid Technology feature bit
> x86/cpu: Describe hybrid CPUs in cpuinfo_x86
>
> Zhang Rui (1):
> perf/x86/rapl: Add support for Intel Alder Lake

hi,
would you have a git branch with all this somewhere?

thanks,
jirka

2021-02-11 17:24:18

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH 00/49] Add Alder Lake support for perf



On 2/11/2021 6:40 AM, Jiri Olsa wrote:
> On Mon, Feb 08, 2021 at 07:24:57AM -0800, [email protected] wrote:
>
> SNIP
>
>> Jin Yao (24):
>> perf jevents: Support unit value "cpu_core" and "cpu_atom"
>> perf util: Save pmu name to struct perf_pmu_alias
>> perf pmu: Save detected hybrid pmus to a global pmu list
>> perf pmu: Add hybrid helper functions
>> perf list: Support --cputype option to list hybrid pmu events
>> perf stat: Hybrid evsel uses its own cpus
>> perf header: Support HYBRID_TOPOLOGY feature
>> perf header: Support hybrid CPU_PMU_CAPS
>> tools headers uapi: Update tools's copy of linux/perf_event.h
>> perf parse-events: Create two hybrid hardware events
>> perf parse-events: Create two hybrid cache events
>> perf parse-events: Support hardware events inside PMU
>> perf list: Display pmu prefix for partially supported hybrid cache
>> events
>> perf parse-events: Support hybrid raw events
>> perf stat: Support --cputype option for hybrid events
>> perf stat: Support metrics with hybrid events
>> perf evlist: Create two hybrid 'cycles' events by default
>> perf stat: Add default hybrid events
>> perf stat: Uniquify hybrid event name
>> perf stat: Merge event counts from all hybrid PMUs
>> perf stat: Filter out unmatched aggregation for hybrid event
>> perf evlist: Warn as events from different hybrid PMUs in a group
>> perf Documentation: Document intel-hybrid support
>> perf evsel: Adjust hybrid event and global event mixed group
>>
>> Kan Liang (22):
>> perf/x86/intel: Hybrid PMU support for perf capabilities
>> perf/x86: Hybrid PMU support for intel_ctrl
>> perf/x86: Hybrid PMU support for counters
>> perf/x86: Hybrid PMU support for unconstrained
>> perf/x86: Hybrid PMU support for hardware cache event
>> perf/x86: Hybrid PMU support for event constraints
>> perf/x86: Hybrid PMU support for extra_regs
>> perf/x86/intel: Factor out intel_pmu_check_num_counters
>> perf/x86/intel: Factor out intel_pmu_check_event_constraints
>> perf/x86/intel: Factor out intel_pmu_check_extra_regs
>> perf/x86: Expose check_hw_exists
>> perf/x86: Remove temporary pmu assignment in event_init
>> perf/x86: Factor out x86_pmu_show_pmu_cap
>> perf/x86: Register hybrid PMUs
>> perf/x86: Add structures for the attributes of Hybrid PMUs
>> perf/x86/intel: Add attr_update for Hybrid PMUs
>> perf/x86: Support filter_match callback
>> perf/x86/intel: Add Alder Lake Hybrid support
>> perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU
>> perf/x86/intel/uncore: Add Alder Lake support
>> perf/x86/msr: Add Alder Lake CPU support
>> perf/x86/cstate: Add Alder Lake CPU support
>>
>> Ricardo Neri (2):
>> x86/cpufeatures: Enumerate Intel Hybrid Technology feature bit
>> x86/cpu: Describe hybrid CPUs in cpuinfo_x86
>>
>> Zhang Rui (1):
>> perf/x86/rapl: Add support for Intel Alder Lake
>
> hi,
> would you have git branch with all this somewhere?
>

Here is the git branch

https://github.com/kliang2/perf.git adl_enabling

Please note that the branch is on top of Peter's perf/core branch, which
doesn't include the latest perf tool changes. The perf tool patches in
the branch only include the critical changes. There will be more tool
patches later, e.g., patches for perf mem, perf test, etc.

Thanks,
Kan

2021-02-18 00:11:33

by Jin Yao

[permalink] [raw]
Subject: Re: [PATCH 00/49] Add Alder Lake support for perf



On 2/12/2021 12:22 AM, Liang, Kan wrote:
>
>
> On 2/11/2021 6:40 AM, Jiri Olsa wrote:
>> On Mon, Feb 08, 2021 at 07:24:57AM -0800, [email protected] wrote:
>>
>> SNIP
>>
>>> Jin Yao (24):
>>>    perf jevents: Support unit value "cpu_core" and "cpu_atom"
>>>    perf util: Save pmu name to struct perf_pmu_alias
>>>    perf pmu: Save detected hybrid pmus to a global pmu list
>>>    perf pmu: Add hybrid helper functions
>>>    perf list: Support --cputype option to list hybrid pmu events
>>>    perf stat: Hybrid evsel uses its own cpus
>>>    perf header: Support HYBRID_TOPOLOGY feature
>>>    perf header: Support hybrid CPU_PMU_CAPS
>>>    tools headers uapi: Update tools's copy of linux/perf_event.h
>>>    perf parse-events: Create two hybrid hardware events
>>>    perf parse-events: Create two hybrid cache events
>>>    perf parse-events: Support hardware events inside PMU
>>>    perf list: Display pmu prefix for partially supported hybrid cache
>>>      events
>>>    perf parse-events: Support hybrid raw events
>>>    perf stat: Support --cputype option for hybrid events
>>>    perf stat: Support metrics with hybrid events
>>>    perf evlist: Create two hybrid 'cycles' events by default
>>>    perf stat: Add default hybrid events
>>>    perf stat: Uniquify hybrid event name
>>>    perf stat: Merge event counts from all hybrid PMUs
>>>    perf stat: Filter out unmatched aggregation for hybrid event
>>>    perf evlist: Warn as events from different hybrid PMUs in a group
>>>    perf Documentation: Document intel-hybrid support
>>>    perf evsel: Adjust hybrid event and global event mixed group
>>>
>>> Kan Liang (22):
>>>    perf/x86/intel: Hybrid PMU support for perf capabilities
>>>    perf/x86: Hybrid PMU support for intel_ctrl
>>>    perf/x86: Hybrid PMU support for counters
>>>    perf/x86: Hybrid PMU support for unconstrained
>>>    perf/x86: Hybrid PMU support for hardware cache event
>>>    perf/x86: Hybrid PMU support for event constraints
>>>    perf/x86: Hybrid PMU support for extra_regs
>>>    perf/x86/intel: Factor out intel_pmu_check_num_counters
>>>    perf/x86/intel: Factor out intel_pmu_check_event_constraints
>>>    perf/x86/intel: Factor out intel_pmu_check_extra_regs
>>>    perf/x86: Expose check_hw_exists
>>>    perf/x86: Remove temporary pmu assignment in event_init
>>>    perf/x86: Factor out x86_pmu_show_pmu_cap
>>>    perf/x86: Register hybrid PMUs
>>>    perf/x86: Add structures for the attributes of Hybrid PMUs
>>>    perf/x86/intel: Add attr_update for Hybrid PMUs
>>>    perf/x86: Support filter_match callback
>>>    perf/x86/intel: Add Alder Lake Hybrid support
>>>    perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU
>>>    perf/x86/intel/uncore: Add Alder Lake support
>>>    perf/x86/msr: Add Alder Lake CPU support
>>>    perf/x86/cstate: Add Alder Lake CPU support
>>>
>>> Ricardo Neri (2):
>>>    x86/cpufeatures: Enumerate Intel Hybrid Technology feature bit
>>>    x86/cpu: Describe hybrid CPUs in cpuinfo_x86
>>>
>>> Zhang Rui (1):
>>>    perf/x86/rapl: Add support for Intel Alder Lake
>>
>> hi,
>> would you have git branch with all this somewhere?
>>
>
> Here is the git branch
>
> https://github.com/kliang2/perf.git adl_enabling
>
> Please note that the branch is on top of Peter's perf/core branch, which doesn't include the latest
> perf tool changes. The perf tool patches in the branch only include the critical changes. There
> will be more tool patches later, e.g., patches for perf mem, perf test, etc.
>
> Thanks,
> Kan

Yes, there will be many more perf tool patches later, such as support for perf mem, perf c2c,
etc.

Since the AlderLake perf tool patches are huge, I will split them into several
series for v2 and then post them.

Thanks
Jin Yao

2021-03-05 00:16:50

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH 00/49] Add Alder Lake support for perf

Hi Peter,

Could you please take a look at the perf kernel patches (3-25)?

By now, we have got some comments regarding the generic hybrid feature
enumeration code and perf tool patches. I would appreciate it very much
if you could comment on the perf kernel patches.

Thanks,
Kan

On 2/8/2021 10:24 AM, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> (The V1 patchset is a complete patchset for the Alder Lake support on
> the Linux perf. It includes both kernel patches (1-25) and the user
> space patches (26-49). It tries to give the maintainers/reviewers an
> overall picture of the ADL enabling patches. The number of the patches
> are huge. Sorry for it. For future versions, the patchset will be
> divided into the kernel patch series and the userspace patch series.
> They can be reviewed separately.)
>
> Alder Lake uses a hybrid architecture utilizing Golden Cove cores
> and Gracemont cores. On such architectures, all CPUs support the same,
> homogeneous and symmetric, instruction set. Also, CPUID enumerate
> the same features for all CPUs. There may be model-specific differences,
> such as those addressed in this patchset.
>
> The first two patches enumerate the hybrid CPU feature bit and save the
> CPU type in a new field x86_cpu_type in struct cpuinfo_x86 for the
> following patches. They were posted previously[1] but not merged.
> Compared with the initial submission, they address the below two
> concerns[2][3],
> - Provide a good use case, PMU.
> - Clarify what Intel Hybrid Technology is and is not.
>
> The PMU capabilities for Golden Cove core and Gracemont core are not the
> same. The key differences include the number of counters, events, perf
> metrics feature, and PEBS-via-PT feature. A dedicated hybrid PMU has to
> be registered for each of them. However, the current perf X86 assumes
> that there is only one CPU PMU. To handle the hybrid PMUs, the patchset
> - Introduce a new struct x86_hybrid_pmu to save the unique capabilities
> from different PMUs. It's part of the global x86_pmu. The architecture
> capabilities, which are available for all PMUs, are still saved in
> the global x86_pmu. I once considered dynamically create dedicated
> x86_pmu and pmu for each hybrid PMU. If so, they have to be changed to
> pointers. Since they are used everywhere, the changes could be huge
> and complex. Also, most of the PMU capabilities are the same between
> hybrid PMUs. Duplicated data in the big x86_pmu structure will be
> saved many times. So the dynamic way was dropped.
> - The hybrid PMU registration has been moved to the cpu_starting(),
> because only boot CPU is available when invoking the
> init_hw_perf_events().
> - Hybrid PMUs have different events and formats. Add new structures and
> helpers for events attribute and format attribute which take the PMU
> type into account.
> - Add a PMU aware version PERF_TYPE_HARDWARE_PMU and
> PERF_TYPE_HW_CACHE_PMU to facilitate user space tools
>
> The uncore, MSR and cstate are the same between hybrid CPUs.
> Don't need to register hybrid PMUs for them.
>
> The generic code kernel/events/core.c is not hybrid friendly either,
> especially for the per-task monitoring. Peter once proposed a
> patchset[4], but it hasn't been merged. This patchset doesn't intend to
> improve the generic code (which can be improved later separately). It
> still uses the capability PERF_PMU_CAP_HETEROGENEOUS_CPUS for each
> hybrid PMUs. For per-task and system-wide monitoring, user space tools
> have to create events on all available hybrid PMUs. The events which are
> from different hybrid PMUs cannot be included in the same group.
>
> [1]. https://lore.kernel.org/lkml/[email protected]/
> [2]. https://lore.kernel.org/lkml/[email protected]/
> [3]. https://lore.kernel.org/lkml/[email protected]/
> [4]. https://lkml.kernel.org/r/[email protected]/
>
> Jin Yao (24):
> perf jevents: Support unit value "cpu_core" and "cpu_atom"
> perf util: Save pmu name to struct perf_pmu_alias
> perf pmu: Save detected hybrid pmus to a global pmu list
> perf pmu: Add hybrid helper functions
> perf list: Support --cputype option to list hybrid pmu events
> perf stat: Hybrid evsel uses its own cpus
> perf header: Support HYBRID_TOPOLOGY feature
> perf header: Support hybrid CPU_PMU_CAPS
> tools headers uapi: Update tools's copy of linux/perf_event.h
> perf parse-events: Create two hybrid hardware events
> perf parse-events: Create two hybrid cache events
> perf parse-events: Support hardware events inside PMU
> perf list: Display pmu prefix for partially supported hybrid cache
> events
> perf parse-events: Support hybrid raw events
> perf stat: Support --cputype option for hybrid events
> perf stat: Support metrics with hybrid events
> perf evlist: Create two hybrid 'cycles' events by default
> perf stat: Add default hybrid events
> perf stat: Uniquify hybrid event name
> perf stat: Merge event counts from all hybrid PMUs
> perf stat: Filter out unmatched aggregation for hybrid event
> perf evlist: Warn as events from different hybrid PMUs in a group
> perf Documentation: Document intel-hybrid support
> perf evsel: Adjust hybrid event and global event mixed group
>
> Kan Liang (22):
> perf/x86/intel: Hybrid PMU support for perf capabilities
> perf/x86: Hybrid PMU support for intel_ctrl
> perf/x86: Hybrid PMU support for counters
> perf/x86: Hybrid PMU support for unconstrained
> perf/x86: Hybrid PMU support for hardware cache event
> perf/x86: Hybrid PMU support for event constraints
> perf/x86: Hybrid PMU support for extra_regs
> perf/x86/intel: Factor out intel_pmu_check_num_counters
> perf/x86/intel: Factor out intel_pmu_check_event_constraints
> perf/x86/intel: Factor out intel_pmu_check_extra_regs
> perf/x86: Expose check_hw_exists
> perf/x86: Remove temporary pmu assignment in event_init
> perf/x86: Factor out x86_pmu_show_pmu_cap
> perf/x86: Register hybrid PMUs
> perf/x86: Add structures for the attributes of Hybrid PMUs
> perf/x86/intel: Add attr_update for Hybrid PMUs
> perf/x86: Support filter_match callback
> perf/x86/intel: Add Alder Lake Hybrid support
> perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU
> perf/x86/intel/uncore: Add Alder Lake support
> perf/x86/msr: Add Alder Lake CPU support
> perf/x86/cstate: Add Alder Lake CPU support
>
> Ricardo Neri (2):
> x86/cpufeatures: Enumerate Intel Hybrid Technology feature bit
> x86/cpu: Describe hybrid CPUs in cpuinfo_x86
>
> Zhang Rui (1):
> perf/x86/rapl: Add support for Intel Alder Lake
>
> arch/x86/events/core.c | 286 ++++++++++---
> arch/x86/events/intel/core.c | 685 ++++++++++++++++++++++++++----
> arch/x86/events/intel/cstate.c | 39 +-
> arch/x86/events/intel/ds.c | 28 +-
> arch/x86/events/intel/uncore.c | 7 +
> arch/x86/events/intel/uncore.h | 1 +
> arch/x86/events/intel/uncore_snb.c | 131 ++++++
> arch/x86/events/msr.c | 2 +
> arch/x86/events/perf_event.h | 117 ++++-
> arch/x86/events/rapl.c | 2 +
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/msr-index.h | 2 +
> arch/x86/include/asm/processor.h | 13 +
> arch/x86/kernel/cpu/common.c | 3 +
> include/linux/perf_event.h | 12 +
> include/uapi/linux/perf_event.h | 26 ++
> kernel/events/core.c | 14 +-
> tools/include/uapi/linux/perf_event.h | 26 ++
> tools/perf/Documentation/intel-hybrid.txt | 335 +++++++++++++++
> tools/perf/Documentation/perf-list.txt | 4 +
> tools/perf/Documentation/perf-record.txt | 1 +
> tools/perf/Documentation/perf-stat.txt | 13 +
> tools/perf/builtin-list.c | 42 +-
> tools/perf/builtin-record.c | 3 +
> tools/perf/builtin-stat.c | 94 +++-
> tools/perf/pmu-events/jevents.c | 2 +
> tools/perf/util/cputopo.c | 80 ++++
> tools/perf/util/cputopo.h | 13 +
> tools/perf/util/env.c | 12 +
> tools/perf/util/env.h | 18 +-
> tools/perf/util/evlist.c | 148 ++++++-
> tools/perf/util/evlist.h | 7 +
> tools/perf/util/evsel.c | 111 ++++-
> tools/perf/util/evsel.h | 10 +-
> tools/perf/util/header.c | 267 +++++++++++-
> tools/perf/util/header.h | 1 +
> tools/perf/util/metricgroup.c | 226 +++++++++-
> tools/perf/util/metricgroup.h | 2 +-
> tools/perf/util/parse-events.c | 405 +++++++++++++++++-
> tools/perf/util/parse-events.h | 10 +-
> tools/perf/util/parse-events.y | 21 +-
> tools/perf/util/pmu.c | 120 +++++-
> tools/perf/util/pmu.h | 24 +-
> tools/perf/util/stat-display.c | 28 +-
> tools/perf/util/stat.h | 2 +
> 45 files changed, 3106 insertions(+), 288 deletions(-)
> create mode 100644 tools/perf/Documentation/intel-hybrid.txt
>

2021-03-05 00:54:03

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 00/49] Add Alder Lake support for perf

On Thu, Mar 04, 2021 at 10:50:45AM -0500, Liang, Kan wrote:
> Hi Peter,
>
> Could you please take a look at the perf kernel patches (3-25)?
>
> By now, we have got some comments regarding the generic hybrid feature
> enumeration code and perf tool patches. I would appreciate it very much if
> you could comment on the perf kernel patches.
>

Yeah, I started staring at it yesterday.. I'll try and respond tomorrow.

2021-03-05 11:18:14

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 00/49] Add Alder Lake support for perf

On Thu, Mar 04, 2021 at 06:50:00PM +0100, Peter Zijlstra wrote:
> On Thu, Mar 04, 2021 at 10:50:45AM -0500, Liang, Kan wrote:
> > Hi Peter,
> >
> > Could you please take a look at the perf kernel patches (3-25)?
> >
> > By now, we have got some comments regarding the generic hybrid feature
> > enumeration code and perf tool patches. I would appreciate it very much if
> > you could comment on the perf kernel patches.
> >
>
> Yeah, I started staring at it yesterday.. I'll try and respond tomorrow.

OK, so STYLE IS HORRIBLE, please surrender your CAPS-LOCK key until
further notice.

The below is a completely unfinished 'diff' against 3-20. It has various
notes interspersed.

Please rework.

--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -44,11 +44,13 @@

#include "perf_event.h"

+static struct pmu pmu;
+
struct x86_pmu x86_pmu __read_mostly;

DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events) = {
.enabled = 1,
- .hybrid_pmu_idx = X86_NON_HYBRID_PMU,
+ .pmu = &pmu;
};

DEFINE_STATIC_KEY_FALSE(rdpmc_never_available_key);
@@ -184,7 +186,7 @@ static inline int get_possible_num_count
{
int bit, num_counters = 0;

- if (!IS_X86_HYBRID)
+ if (!is_hybrid())
return x86_pmu.num_counters;

for_each_set_bit(bit, &x86_pmu.hybrid_pmu_bitmap, X86_HYBRID_PMU_MAX_INDEX)
@@ -270,6 +272,9 @@ bool check_hw_exists(int num_counters, i
if (ret)
goto msr_fail;
for (i = 0; i < num_counters_fixed; i++) {
+ /*
+ * XXX comment that explains why/how NULL below works.
+ */
if (fixed_counter_disabled(i, NULL))
continue;
if (val & (0x03 << i*4)) {
@@ -352,7 +357,6 @@ set_ext_hw_attr(struct hw_perf_event *hw
{
struct perf_event_attr *attr = &event->attr;
unsigned int cache_type, cache_op, cache_result;
- struct x86_hybrid_pmu *pmu = IS_X86_HYBRID ? container_of(event->pmu, struct x86_hybrid_pmu, pmu) : NULL;
u64 config, val;

config = attr->config;
@@ -372,10 +376,7 @@ set_ext_hw_attr(struct hw_perf_event *hw
return -EINVAL;
cache_result = array_index_nospec(cache_result, PERF_COUNT_HW_CACHE_RESULT_MAX);

- if (pmu)
- val = pmu->hw_cache_event_ids[cache_type][cache_op][cache_result];
- else
- val = hw_cache_event_ids[cache_type][cache_op][cache_result];
+ val = hybrid(event->pmu, hw_cache_event_ids)[cache_type][cache_op][cache_result];

if (val == 0)
return -ENOENT;
@@ -384,10 +385,7 @@ set_ext_hw_attr(struct hw_perf_event *hw
return -EINVAL;

hwc->config |= val;
- if (pmu)
- attr->config1 = pmu->hw_cache_extra_regs[cache_type][cache_op][cache_result];
- else
- attr->config1 = hw_cache_extra_regs[cache_type][cache_op][cache_result];
+ attr->config1 = hybrid(event->pmu, hw_cache_extra_regs)[cache_type][cache_op][cache_result];
return x86_pmu_extra_regs(val, event);
}

@@ -742,13 +740,11 @@ void x86_pmu_enable_all(int added)
}
}

-static struct pmu pmu;
-
static inline int is_x86_event(struct perf_event *event)
{
int bit;

- if (!IS_X86_HYBRID)
+ if (!is_hybrid())
return event->pmu == &pmu;

for_each_set_bit(bit, &x86_pmu.hybrid_pmu_bitmap, X86_HYBRID_PMU_MAX_INDEX) {
@@ -760,6 +756,7 @@ static inline int is_x86_event(struct pe

struct pmu *x86_get_pmu(void)
{
+ /* borken */
return &pmu;
}
/*
@@ -963,7 +960,7 @@ EXPORT_SYMBOL_GPL(perf_assign_events);

int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
{
- int num_counters = X86_HYBRID_READ_FROM_CPUC(num_counters, cpuc);
+ int num_counters = hybrid(cpuc->pmu, num_counters);
struct event_constraint *c;
struct perf_event *e;
int n0, i, wmin, wmax, unsched = 0;
@@ -1124,7 +1121,7 @@ static void del_nr_metric_event(struct c
static int collect_event(struct cpu_hw_events *cpuc, struct perf_event *event,
int max_count, int n)
{
- union perf_capabilities intel_cap = X86_HYBRID_READ_FROM_CPUC(intel_cap, cpuc);
+ union perf_capabilities intel_cap = hybrid(cpuc->pmu, intel_cap);

if (intel_cap.perf_metrics && add_nr_metric_event(cpuc, event))
return -EINVAL;
@@ -1147,8 +1144,8 @@ static int collect_event(struct cpu_hw_e
*/
static int collect_events(struct cpu_hw_events *cpuc, struct perf_event *leader, bool dogrp)
{
- int num_counters = X86_HYBRID_READ_FROM_CPUC(num_counters, cpuc);
- int num_counters_fixed = X86_HYBRID_READ_FROM_CPUC(num_counters_fixed, cpuc);
+ int num_counters = hybrid(cpuc->pmu, num_counters);
+ int num_counters_fixed = hybrid(cpuc->pmu, num_counters_fixed);
struct perf_event *event;
int n, max_count;

@@ -1522,9 +1519,9 @@ void perf_event_print_debug(void)
u64 pebs, debugctl;
int cpu = smp_processor_id();
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
- int num_counters = X86_HYBRID_READ_FROM_CPUC(num_counters, cpuc);
- int num_counters_fixed = X86_HYBRID_READ_FROM_CPUC(num_counters_fixed, cpuc);
- struct event_constraint *pebs_constraints = X86_HYBRID_READ_FROM_CPUC(pebs_constraints, cpuc);
+ int num_counters = hybrid(cpuc->pmu, num_counters);
+ int num_counters_fixed = hybrid(cpuc->pmu, num_counters_fixed);
+ struct event_constraint *pebs_constraints = hybrid(cpuc->pmu, pebs_constraints);
unsigned long flags;
int idx;

@@ -1605,7 +1602,7 @@ void x86_pmu_stop(struct perf_event *eve
static void x86_pmu_del(struct perf_event *event, int flags)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
- union perf_capabilities intel_cap = X86_HYBRID_READ_FROM_CPUC(intel_cap, cpuc);
+ union perf_capabilities intel_cap = hybrid(cpuc->pmu, intel_cap);
int i;

/*
@@ -2105,7 +2102,7 @@ static int __init init_hw_perf_events(vo

pmu.attr_update = x86_pmu.attr_update;

- if (!IS_X86_HYBRID) {
+ if (!is_hybrid()) {
x86_pmu_show_pmu_cap(x86_pmu.num_counters,
x86_pmu.num_counters_fixed,
x86_pmu.intel_ctrl);
@@ -2139,7 +2136,7 @@ static int __init init_hw_perf_events(vo
if (err)
goto out1;

- if (!IS_X86_HYBRID) {
+ if (!is_hybrid()) {
err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
if (err)
goto out2;
@@ -2303,7 +2300,11 @@ static struct cpu_hw_events *allocate_fa
return ERR_PTR(-ENOMEM);
cpuc->is_fake = 1;

- if (IS_X86_HYBRID)
+ /*
+ * Utterly broken, this selects a random pmu to validate on;
+ * it should match event->pmu.
+ */
+ if (is_hybrid())
cpuc->hybrid_pmu_idx = x86_hybrid_get_idx_from_cpu(cpu);
else
cpuc->hybrid_pmu_idx = X86_NON_HYBRID_PMU;
@@ -2362,8 +2363,10 @@ static int validate_group(struct perf_ev

/*
* Reject events from different hybrid PMUs.
+ *
+ * This is just flat out buggered.
*/
- if (IS_X86_HYBRID) {
+ if (is_hybrid()) {
struct perf_event *sibling;
struct pmu *pmu = NULL;

@@ -2380,6 +2383,26 @@ static int validate_group(struct perf_ev

if (pmu && pmu != event->pmu)
return ret;
+
+ /*
+ * Maybe something like so..
+ */
+
+ struct perf_event *sibling;
+ struct pmu *pmu = NULL;
+
+ if (is_x86_event(leader))
+ pmu = leader->pmu;
+
+ for_each_sibling_event(sibling, leader) {
+ if (!is_x86_event(sibling))
+ continue;
+
+ if (!pmu)
+ pmu = sibling->pmu;
+ else if (pmu != sibling->pmu)
+ return ret;
+ }
}

fake_cpuc = allocate_fake_cpuc();
@@ -2418,7 +2441,7 @@ static int x86_pmu_event_init(struct per
(event->attr.type != PERF_TYPE_HW_CACHE))
return -ENOENT;

- if (IS_X86_HYBRID && (event->cpu != -1)) {
+ if (is_hybrid() && (event->cpu != -1)) {
hybrid_pmu = container_of(event->pmu, struct x86_hybrid_pmu, pmu);
if (!cpumask_test_cpu(event->cpu, &hybrid_pmu->supported_cpus))
return -ENOENT;
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3168,7 +3168,7 @@ struct event_constraint *
x86_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
struct perf_event *event)
{
- struct event_constraint *event_constraints = X86_HYBRID_READ_FROM_CPUC(event_constraints, cpuc);
+ struct event_constraint *event_constraints = hybrid(cpuc->pmu, event_constraints);
struct event_constraint *c;

if (event_constraints) {
@@ -3180,10 +3180,10 @@ x86_get_event_constraints(struct cpu_hw_
}
}

- if (!HAS_VALID_HYBRID_PMU_IN_CPUC(cpuc))
+ if (!is_hybrid() || !cpuc->pmu)
return &unconstrained;

- return &x86_pmu.hybrid_pmu[cpuc->hybrid_pmu_idx].unconstrained;
+ return hybrid_pmu(cpuc->pmu)->unconstrained;
}

static struct event_constraint *
@@ -3691,7 +3691,7 @@ static inline bool intel_pmu_has_cap(str
{
struct x86_hybrid_pmu *pmu;

- if (!IS_X86_HYBRID)
+ if (!is_hybrid())
return test_bit(idx, (unsigned long *)&x86_pmu.intel_cap.capabilities);

pmu = container_of(event->pmu, struct x86_hybrid_pmu, pmu);
@@ -4224,7 +4224,7 @@ int intel_cpuc_prepare(struct cpu_hw_eve
{
cpuc->pebs_record_size = x86_pmu.pebs_record_size;

- if (IS_X86_HYBRID || x86_pmu.extra_regs || x86_pmu.lbr_sel_map) {
+ if (is_hybrid() || x86_pmu.extra_regs || x86_pmu.lbr_sel_map) {
cpuc->shared_regs = allocate_shared_regs(cpu);
if (!cpuc->shared_regs)
goto err;
@@ -4377,8 +4377,8 @@ static void init_hybrid_pmu(int cpu)
if (!test_bit(idx, &x86_pmu.hybrid_pmu_bitmap))
return;

- cpuc->hybrid_pmu_idx = idx;
pmu = &x86_pmu.hybrid_pmu[idx];
+	cpuc->pmu = &pmu->pmu;

/* Only register PMU for the first CPU */
if (!cpumask_empty(&pmu->supported_cpus)) {
@@ -4451,7 +4451,7 @@ static void intel_pmu_cpu_starting(int c
int core_id = topology_core_id(cpu);
int i;

- if (IS_X86_HYBRID)
+ if (is_hybrid())
init_hybrid_pmu(cpu);

init_debug_store_on_cpu(cpu);
@@ -4480,7 +4480,7 @@ static void intel_pmu_cpu_starting(int c
* feature for now. The corresponding bit should always be 0 on
* a hybrid platform, e.g., Alder Lake.
*/
- if (!IS_X86_HYBRID && x86_pmu.intel_cap.perf_metrics) {
+ if (!is_hybrid() && x86_pmu.intel_cap.perf_metrics) {
union perf_capabilities perf_cap;

rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap.capabilities);
@@ -4569,7 +4569,7 @@ static void intel_pmu_cpu_dead(int cpu)
{
intel_cpuc_finish(&per_cpu(cpu_hw_events, cpu));

- if (IS_X86_HYBRID) {
+ if (is_hybrid()) {
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
int idx = x86_hybrid_get_idx_from_cpu(cpu);
struct x86_hybrid_pmu *hybrid_pmu;
@@ -4579,7 +4579,7 @@ static void intel_pmu_cpu_dead(int cpu)

hybrid_pmu = &x86_pmu.hybrid_pmu[idx];
cpumask_clear_cpu(cpu, &hybrid_pmu->supported_cpus);
- cpuc->hybrid_pmu_idx = X86_NON_HYBRID_PMU;
+ cpuc->pmu = NULL;
if (cpumask_empty(&hybrid_pmu->supported_cpus)) {
perf_pmu_unregister(&hybrid_pmu->pmu);
hybrid_pmu->pmu.type = -1;
@@ -6217,7 +6217,7 @@ __init int intel_pmu_init(void)

snprintf(pmu_name_str, sizeof(pmu_name_str), "%s", name);

- if (!IS_X86_HYBRID) {
+ if (!is_hybrid()) {
group_events_td.attrs = td_attr;
group_events_mem.attrs = mem_attr;
group_events_tsx.attrs = tsx_attr;
@@ -6273,7 +6273,7 @@ __init int intel_pmu_init(void)
pr_cont("full-width counters, ");
}

- if (!IS_X86_HYBRID && x86_pmu.intel_cap.perf_metrics)
+ if (!is_hybrid() && x86_pmu.intel_cap.perf_metrics)
x86_pmu.intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;

return 0;
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2217,7 +2217,7 @@ void __init intel_ds_init(void)
}
pr_cont("PEBS fmt4%c%s, ", pebs_type, pebs_qual);

- if (!IS_X86_HYBRID && x86_pmu.intel_cap.pebs_output_pt_available) {
+ if (!is_hybrid() && x86_pmu.intel_cap.pebs_output_pt_available) {
pr_cont("PEBS-via-PT, ");
x86_get_pmu()->capabilities |= PERF_PMU_CAP_AUX_OUTPUT;
}
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -331,7 +331,7 @@ struct cpu_hw_events {
/*
* Hybrid PMU support
*/
- int hybrid_pmu_idx;
+ struct pmu *pmu;
};

#define __EVENT_CONSTRAINT_RANGE(c, e, n, m, w, o, f) { \
@@ -671,21 +671,30 @@ struct x86_hybrid_pmu {
struct extra_reg *extra_regs;
};

-#define IS_X86_HYBRID cpu_feature_enabled(X86_FEATURE_HYBRID_CPU)
-
-#define HAS_VALID_HYBRID_PMU_IN_CPUC(_cpuc) \
- (IS_X86_HYBRID && \
- ((_cpuc)->hybrid_pmu_idx >= X86_HYBRID_PMU_ATOM_IDX) && \
- ((_cpuc)->hybrid_pmu_idx < X86_HYBRID_PMU_MAX_INDEX))
+static __always_inline bool is_hybrid(void)
+{
+ return unlikely(cpu_feature_enabled(X86_FEATURE_HYBRID_CPU));
+}

-#define X86_HYBRID_READ_FROM_CPUC(_name, _cpuc) \
- (_cpuc && HAS_VALID_HYBRID_PMU_IN_CPUC(_cpuc) ? x86_pmu.hybrid_pmu[(_cpuc)->hybrid_pmu_idx]._name : x86_pmu._name)
+static __always_inline bool is_hybrid_idx(int idx)
+{
+ return (unsigned)idx < X86_HYBRID_PMU_MAX_INDEX;
+}

-#define X86_HYBRID_READ_FROM_EVENT(_name, _event) \
- (IS_X86_HYBRID ? ((struct x86_hybrid_pmu *)(_event->pmu))->_name : x86_pmu._name)
+static __always_inline struct x86_hybrid_pmu *hybrid_pmu(struct pmu *pmu)
+{
+ return container_of(pmu, struct x86_hybrid_pmu, pmu);
+}

-#define IS_VALID_HYBRID_PMU_IDX(idx) \
- (idx < X86_HYBRID_PMU_MAX_INDEX && idx > X86_NON_HYBRID_PMU)
+#define hybrid(_pmu, _field) \
+({ \
+ typeof(x86_pmu._field) __F = x86_pmu._field; \
+ \
+ if (is_hybrid() && (_pmu)) \
+ __F = hybrid_pmu(_pmu)->_field; \
+ \
+ __F; \
+})

static inline enum x86_hybrid_pmu_type_idx
x86_hybrid_get_idx_from_cpu(unsigned int cpu)
@@ -898,9 +907,12 @@ struct x86_pmu {
* for all PMUs. The hybrid_pmu only includes the unique capabilities.
* The hybrid_pmu_bitmap is the bits map of the possible hybrid_pmu.
*/
+ int (*filter_match)(struct perf_event *event);
unsigned long hybrid_pmu_bitmap;
+ /*
+ * This thing is huge, use dynamic allocation!
+ */
struct x86_hybrid_pmu hybrid_pmu[X86_HYBRID_PMU_MAX_INDEX];
- int (*filter_match)(struct perf_event *event);
};

struct x86_perf_task_context_opt {
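
For the allocate_fake_cpuc() note above ("it should match event->pmu"), a minimal
sketch of what passing the event's PMU through could look like; the event_pmu
parameter and the cpuc->pmu assignment here are illustrative only, not part of the
posted patches:

static struct cpu_hw_events *allocate_fake_cpuc(struct pmu *event_pmu)
{
	struct cpu_hw_events *cpuc;

	cpuc = kzalloc(sizeof(*cpuc), GFP_KERNEL);
	if (!cpuc)
		return ERR_PTR(-ENOMEM);
	cpuc->is_fake = 1;
	/* Validate against the PMU the event was opened on, not whatever this CPU happens to run. */
	cpuc->pmu = event_pmu;
	return cpuc;
}

A caller such as validate_group() would then do fake_cpuc = allocate_fake_cpuc(event->pmu)
instead of deriving a hybrid index from the current CPU.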
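
Likewise, for the "use dynamic allocation" note on struct x86_pmu, one possible shape
is to make hybrid_pmu a pointer that is sized at init time; the field and helper names
below are only illustrative:

	/* in struct x86_pmu, replacing the fixed-size array: */
	int			num_hybrid_pmus;
	struct x86_hybrid_pmu	*hybrid_pmu;	/* kcalloc()'d, one entry per hybrid PMU type */

static int __init allocate_hybrid_pmus(int nr)
{
	x86_pmu.hybrid_pmu = kcalloc(nr, sizeof(*x86_pmu.hybrid_pmu), GFP_KERNEL);
	if (!x86_pmu.hybrid_pmu)
		return -ENOMEM;
	x86_pmu.num_hybrid_pmus = nr;
	return 0;
}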

2021-03-05 13:38:35

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH 00/49] Add Alder Lake support for perf



On 3/5/2021 6:14 AM, Peter Zijlstra wrote:
> On Thu, Mar 04, 2021 at 06:50:00PM +0100, Peter Zijlstra wrote:
>> On Thu, Mar 04, 2021 at 10:50:45AM -0500, Liang, Kan wrote:
>>> Hi Peter,
>>>
>>> Could you please take a look at the perf kernel patches (3-25)?
>>>
>>> By now, we have got some comments regarding the generic hybrid feature
>>> enumeration code and perf tool patches. I would appreciate it very much if
>>> you could comment on the perf kernel patches.
>>>
>>
>> Yeah, I started staring at it yesterday.. I'll try and respond tomorrow.
>
> OK, so STYLE IS HORRIBLE, please surrender your CAPS-LOCK key until
> further notice.
>
> The below is a completely unfinished 'diff' against 3-20. It has various
> notes interspersed.
>
> Please rework.


Thanks for the detailed comments. I will rework the code accordingly.

Thanks,
Kan
