2021-04-05 22:51:57

by Liang, Kan

Subject: [PATCH V5 00/25] Add Alder Lake support for perf (kernel)

From: Kan Liang <[email protected]>

Changes since V4:
- Put the X86_HYBRID_CPU_TYPE_ID_SHIFT over the function where it is
used (Boris) (Patch 2)
- Add Acked-by from Boris for Patch 1 & 2
- Fix a smatch warning, "allocate_fake_cpuc() warn: possible memory
leak of 'cpuc'" (0-DAY test) (Patch 16)

Changes since V3:
- Check whether the supported_cpus is empty in allocate_fake_cpuc().
A user may offline all the CPUs of a certain type. Perf should not
create an event for that PMU. (Patch 16)
- Don't clear cpuc->pmu when the CPU is offlined in intel_pmu_cpu_dead().
We never unregister a PMU, even if all the CPUs of a certain type are
offlined. A cpuc->pmu should always be valid and unchanged. There is no
harm in keeping the pointer to the PMU. Also, some functions, e.g.,
release_lbr_buffers(), require a valid cpuc->pmu for each possible CPU.
(Patch 16)
- ADL may have an alternative configuration. With that configuration,
X86_FEATURE_HYBRID_CPU is not set, and perf cannot retrieve the core type
from CPUID leaf 0x1a either.
Use the number of hybrid PMUs, which implies a hybrid system, to replace
the check of X86_FEATURE_HYBRID_CPU. (Patch 4)
Introduce a platform-specific get_hybrid_cpu_type to retrieve the core
type if the generic one doesn't return a valid core type. (Patch 16 & 20)

Changes since V2:
- Don't show "hybrid_cpu" in /proc/cpuinfo (Boris) (Patch 1)
- Use get_this_hybrid_cpu_type() to replace get_hybrid_cpu_type() to
avoid the trouble of IPIs. The new function retrieves the type of the
current hybrid CPU. It's good enough for perf. (Dave) (Patch 2)
- Remove the definitions for the Atom and Core CPU types. Perf will define
an enum for the hybrid CPU type in perf_event.h (Peter) (Patch 2 & 16)
- Remove X86_HYBRID_CPU_NATIVE_MODEL_ID_MASK. Not used in the patch set
(Kan)(Patch 2)
- Update the description of the patch 2 accordingly. (Boris) (Patch 2)
- All the hybrid PMUs are registered at boot time. (Peter) (Patch 16)
- Align all ATTR things. (Peter) (Patch 20)
- The patchset doesn't change caps/pmu_name. The perf tool doesn't rely on
it to distinguish the event list. The caps/pmu_name only indicates the
microarchitecture, which is the hybrid Alder Lake for both PMUs.

Changes since V1:
- Drop all user space patches, which will be reviewed later separately.
- Don't save the CPU type in struct cpuinfo_x86. Instead, provide helper
functions to get parameters of hybrid CPUs. (Boris)
- Rework the perf kernel patches according to Peter's suggestion. The
key changes include,
- Code style changes. Drop all the macros whose names are in capital
letters.
- Drop the hybrid PMU index, track the pointer of the hybrid PMU in
the per-CPU struct cpu_hw_events.
- Fix the x86_get_pmu() support
- Fix the allocate_fake_cpuc() support
- Fix validate_group() support
- Dynamically allocate the *hybrid_pmu for each hybrid PMU

Alder Lake uses a hybrid architecture utilizing Golden Cove cores
and Gracemont cores. On such architectures, all CPUs support the same,
homogeneous and symmetric, instruction set. Also, CPUID enumerates
the same features for all CPUs. There may be model-specific differences,
such as those addressed in this patchset.

The first two patches enumerate the hybrid CPU feature bit and provide
a helper function to get the CPU type of hybrid CPUs. (The initial idea
[1] was to save the CPU type in a new field x86_cpu_type in struct
cpuinfo_x86. Since the only user of the new field is perf, querying the
X86_FEATURE_HYBRID_CPU at the call site is a simpler alternative.[2])
Compared with the initial submission, the two concerns below[3][4] are
also addressed:
- Provide a good use case, PMU.
- Clarify what Intel Hybrid Technology is and is not.
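
To make the helper concrete, here is a minimal sketch, not the code added by
Ricardo Neri's patches (the real helper also validates the CPUID level), of
how the core type of the current CPU could be derived from CPUID leaf 0x1a.
EAX[31:24] carries the core type, and the values line up with the hybrid_big
(0x40) and hybrid_small (0x20) identifiers used later in this series:

static u8 hybrid_cpu_type_sketch(void)
{
        /* Hybrid enumeration is gated on the feature bit from patch 1. */
        if (!boot_cpu_has(X86_FEATURE_HYBRID_CPU))
                return 0;

        /* The core type of the current CPU is reported in EAX[31:24]. */
        return cpuid_eax(0x1a) >> 24;
}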

The PMU capabilities of the Golden Cove core and the Gracemont core are not
the same. The key differences include the number of counters, the events,
the perf metrics feature, and the PEBS-via-PT feature. A dedicated hybrid
PMU has to be registered for each of them. However, the current perf x86
code assumes that there is only one CPU PMU. To handle the hybrid PMUs, the
patchset:
- Introduce a new struct x86_hybrid_pmu to save the unique capabilities
of the different PMUs. It's part of the global x86_pmu. The architectural
capabilities, which are available for all PMUs, are still saved in
the global x86_pmu. To save space, the x86_hybrid_pmu is
dynamically allocated.
- The hybrid PMUs are registered at boot time; the per-PMU capability check
and the display of the PMU information are postponed to cpu_starting(),
because only the boot CPU is available when init_hw_perf_events() is
invoked.
- Hybrid PMUs have different events and formats. Add new structures and
helpers for the events attribute and the format attribute which take the
PMU type into account.
- Add PMU-aware versions, PERF_TYPE_HARDWARE_PMU and
PERF_TYPE_HW_CACHE_PMU, to facilitate user space tools.

The uncore, MSR and cstate PMUs are the same across the hybrid CPU types.
There is no need to register hybrid PMUs for them.

The generic code kernel/events/core.c is not hybrid friendly either,
especially for per-task monitoring. Peter once proposed a patchset[5], but
it hasn't been merged. This patchset doesn't intend to improve the generic
code (which can be improved later separately). It still uses the capability
PERF_PMU_CAP_HETEROGENEOUS_CPUS for each hybrid PMU. For per-task and
system-wide monitoring, user space tools have to create events on all
available hybrid PMUs. Events from different hybrid PMUs cannot be included
in the same group.
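
As a rough user-space sketch of that constraint (this is not a tool shipped
with the series; the sysfs PMU names come from the "Add Alder Lake Hybrid
support" patch, and raw event 0x3c, the architectural unhalted-core-cycles
event, is only an illustrative choice), a per-task measurement would open
one ungrouped event per hybrid PMU instead of a single grouped event:

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static int open_cycles_on_pmu(const char *pmu, pid_t pid)
{
        struct perf_event_attr attr = { .size = sizeof(attr) };
        char path[128];
        int type = -1;
        FILE *f;

        /* Every registered PMU exposes its dynamic type id in sysfs. */
        snprintf(path, sizeof(path),
                 "/sys/bus/event_source/devices/%s/type", pmu);
        f = fopen(path, "r");
        if (!f)
                return -1;
        if (fscanf(f, "%d", &type) != 1)
                type = -1;
        fclose(f);
        if (type < 0)
                return -1;

        attr.type = type;
        attr.config = 0x3c;     /* unhalted core cycles, raw encoding */

        return syscall(__NR_perf_event_open, &attr, pid, -1, -1, 0);
}

int main(void)
{
        /* One ungrouped event per hybrid PMU for the current task. */
        int core_fd = open_cycles_on_pmu("cpu_core", 0);
        int atom_fd = open_cycles_on_pmu("cpu_atom", 0);

        return (core_fd < 0 || atom_fd < 0) ? 1 : 0;
}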

[1]. https://lore.kernel.org/lkml/[email protected]/
[2]. https://lore.kernel.org/lkml/[email protected]/
[3]. https://lore.kernel.org/lkml/[email protected]/
[4]. https://lore.kernel.org/lkml/[email protected]/
[5]. https://lkml.kernel.org/r/[email protected]/

The kernel code can also be found at
https://github.com/kliang2/perf.git adl_enabling

Kan Liang (22):
perf/x86: Track pmu in per-CPU cpu_hw_events
perf/x86/intel: Hybrid PMU support for perf capabilities
perf/x86: Hybrid PMU support for intel_ctrl
perf/x86: Hybrid PMU support for counters
perf/x86: Hybrid PMU support for unconstrained
perf/x86: Hybrid PMU support for hardware cache event
perf/x86: Hybrid PMU support for event constraints
perf/x86: Hybrid PMU support for extra_regs
perf/x86/intel: Factor out intel_pmu_check_num_counters
perf/x86/intel: Factor out intel_pmu_check_event_constraints
perf/x86/intel: Factor out intel_pmu_check_extra_regs
perf/x86: Remove temporary pmu assignment in event_init
perf/x86: Factor out x86_pmu_show_pmu_cap
perf/x86: Register hybrid PMUs
perf/x86: Add structures for the attributes of Hybrid PMUs
perf/x86/intel: Add attr_update for Hybrid PMUs
perf/x86: Support filter_match callback
perf/x86/intel: Add Alder Lake Hybrid support
perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU
perf/x86/intel/uncore: Add Alder Lake support
perf/x86/msr: Add Alder Lake CPU support
perf/x86/cstate: Add Alder Lake CPU support

Ricardo Neri (2):
x86/cpufeatures: Enumerate Intel Hybrid Technology feature bit
x86/cpu: Add helper function to get the type of the current hybrid CPU

Zhang Rui (1):
perf/x86/rapl: Add support for Intel Alder Lake

arch/x86/events/core.c | 342 ++++++++++++++----
arch/x86/events/intel/core.c | 696 ++++++++++++++++++++++++++++++++-----
arch/x86/events/intel/cstate.c | 39 ++-
arch/x86/events/intel/ds.c | 32 +-
arch/x86/events/intel/lbr.c | 9 +-
arch/x86/events/intel/uncore.c | 7 +
arch/x86/events/intel/uncore.h | 1 +
arch/x86/events/intel/uncore_snb.c | 131 +++++++
arch/x86/events/msr.c | 2 +
arch/x86/events/perf_event.h | 108 +++++-
arch/x86/events/rapl.c | 2 +
arch/x86/include/asm/cpu.h | 6 +
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 3 +
arch/x86/kernel/cpu/intel.c | 16 +
include/linux/perf_event.h | 12 +
include/uapi/linux/perf_event.h | 26 ++
kernel/events/core.c | 14 +-
18 files changed, 1252 insertions(+), 195 deletions(-)

--
2.7.4


2021-04-05 22:52:04

by Liang, Kan

Subject: [PATCH V5 19/25] perf/x86: Support filter_match callback

From: Kan Liang <[email protected]>

Implement the filter_match callback for x86, which checks whether an event
is schedulable on the current CPU.
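
For orientation, a simplified sketch of how the generic core consumes this
hook (this is not part of the patch; the real logic in kernel/events/core.c
also checks the siblings of a group leader): an event is only treated as
schedulable on the current CPU if the PMU's filter_match callback, when
present, accepts it.

static int pmu_filter_match_sketch(struct perf_event *event)
{
        return event->pmu->filter_match ? event->pmu->filter_match(event) : 1;
}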

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 10 ++++++++++
arch/x86/events/perf_event.h | 1 +
2 files changed, 11 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 16b4f6f..09922ee 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2650,6 +2650,14 @@ static int x86_pmu_aux_output_match(struct perf_event *event)
return 0;
}

+static int x86_pmu_filter_match(struct perf_event *event)
+{
+ if (x86_pmu.filter_match)
+ return x86_pmu.filter_match(event);
+
+ return 1;
+}
+
static struct pmu pmu = {
.pmu_enable = x86_pmu_enable,
.pmu_disable = x86_pmu_disable,
@@ -2677,6 +2685,8 @@ static struct pmu pmu = {
.check_period = x86_pmu_check_period,

.aux_output_match = x86_pmu_aux_output_match,
+
+ .filter_match = x86_pmu_filter_match,
};

void arch_perf_update_userpage(struct perf_event *event,
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index c1c90c3..f996686 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -870,6 +870,7 @@ struct x86_pmu {

int (*aux_output_match) (struct perf_event *event);

+ int (*filter_match)(struct perf_event *event);
/*
* Hybrid support
*
--
2.7.4

2021-04-05 22:52:05

by Liang, Kan

Subject: [PATCH V5 20/25] perf/x86/intel: Add Alder Lake Hybrid support

From: Kan Liang <[email protected]>

An Alder Lake hybrid system has two different types of cores, the Golden
Cove core and the Gracemont core. The Golden Cove core is registered to
the "cpu_core" PMU. The Gracemont core is registered to the "cpu_atom" PMU.

The differences between the two PMUs include:
- Number of GP and fixed counters
- Events
- The "cpu_core" PMU supports Topdown metrics.
The "cpu_atom" PMU supports PEBS-via-PT.

The "cpu_core" PMU is similar to the Sapphire Rapids PMU, but without
PMEM.
The "cpu_atom" PMU is similar to Tremont, but with different events,
event_constraints, extra_regs and number of counters.

The mem-loads AUX event workaround only applies to the Golden Cove core.

Users may disable all CPUs of the same CPU type on the command line or
in the BIOS. For this case, perf still registers a PMU for that CPU type,
but the CPU mask is 0.

The current caps/pmu_name is usually the microarchitecture codename. Assign
"alderlake_hybrid" to the caps/pmu_name of both PMUs to indicate the
hybrid Alder Lake microarchitecture.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 254 ++++++++++++++++++++++++++++++++++++++++++-
arch/x86/events/intel/ds.c | 7 ++
arch/x86/events/perf_event.h | 7 ++
3 files changed, 267 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 07af58c..2b553d9 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2076,6 +2076,14 @@ static struct extra_reg intel_tnt_extra_regs[] __read_mostly = {
EVENT_EXTRA_END
};

+static struct extra_reg intel_grt_extra_regs[] __read_mostly = {
+ /* must define OFFCORE_RSP_X first, see intel_fixup_er() */
+ INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x3fffffffffull, RSP_0),
+ INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0x3fffffffffull, RSP_1),
+ INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x5d0),
+ EVENT_EXTRA_END
+};
+
#define KNL_OT_L2_HITE BIT_ULL(19) /* Other Tile L2 Hit */
#define KNL_OT_L2_HITF BIT_ULL(20) /* Other Tile L2 Hit */
#define KNL_MCDRAM_LOCAL BIT_ULL(21)
@@ -2430,6 +2438,16 @@ static int icl_set_topdown_event_period(struct perf_event *event)
return 0;
}

+static int adl_set_topdown_event_period(struct perf_event *event)
+{
+ struct x86_hybrid_pmu *pmu = hybrid_pmu(event->pmu);
+
+ if (pmu->cpu_type != hybrid_big)
+ return 0;
+
+ return icl_set_topdown_event_period(event);
+}
+
static inline u64 icl_get_metrics_event_value(u64 metric, u64 slots, int idx)
{
u32 val;
@@ -2570,6 +2588,17 @@ static u64 icl_update_topdown_event(struct perf_event *event)
x86_pmu.num_topdown_events - 1);
}

+static u64 adl_update_topdown_event(struct perf_event *event)
+{
+ struct x86_hybrid_pmu *pmu = hybrid_pmu(event->pmu);
+
+ if (pmu->cpu_type != hybrid_big)
+ return 0;
+
+ return icl_update_topdown_event(event);
+}
+
+
static void intel_pmu_read_topdown_event(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -3658,6 +3687,17 @@ static inline bool is_mem_loads_aux_event(struct perf_event *event)
return (event->attr.config & INTEL_ARCH_EVENT_MASK) == X86_CONFIG(.event=0x03, .umask=0x82);
}

+static inline bool require_mem_loads_aux_event(struct perf_event *event)
+{
+ if (!(x86_pmu.flags & PMU_FL_MEM_LOADS_AUX))
+ return false;
+
+ if (is_hybrid())
+ return hybrid_pmu(event->pmu)->cpu_type == hybrid_big;
+
+ return true;
+}
+
static inline bool intel_pmu_has_cap(struct perf_event *event, int idx)
{
union perf_capabilities *intel_cap;
@@ -3785,7 +3825,7 @@ static int intel_pmu_hw_config(struct perf_event *event)
* event. The rule is to simplify the implementation of the check.
* That's because perf cannot have a complete group at the moment.
*/
- if (x86_pmu.flags & PMU_FL_MEM_LOADS_AUX &&
+ if (require_mem_loads_aux_event(event) &&
(event->attr.sample_type & PERF_SAMPLE_DATA_SRC) &&
is_mem_loads_event(event)) {
struct perf_event *leader = event->group_leader;
@@ -4062,6 +4102,39 @@ tfa_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
return c;
}

+static struct event_constraint *
+adl_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
+ struct perf_event *event)
+{
+ struct x86_hybrid_pmu *pmu = hybrid_pmu(event->pmu);
+
+ if (pmu->cpu_type == hybrid_big)
+ return spr_get_event_constraints(cpuc, idx, event);
+ else if (pmu->cpu_type == hybrid_small)
+ return tnt_get_event_constraints(cpuc, idx, event);
+
+ WARN_ON(1);
+ return &emptyconstraint;
+}
+
+static int adl_hw_config(struct perf_event *event)
+{
+ struct x86_hybrid_pmu *pmu = hybrid_pmu(event->pmu);
+
+ if (pmu->cpu_type == hybrid_big)
+ return hsw_hw_config(event);
+ else if (pmu->cpu_type == hybrid_small)
+ return intel_pmu_hw_config(event);
+
+ WARN_ON(1);
+ return -EOPNOTSUPP;
+}
+
+static u8 adl_get_hybrid_cpu_type(void)
+{
+ return hybrid_big;
+}
+
/*
* Broadwell:
*
@@ -4422,6 +4495,14 @@ static int intel_pmu_aux_output_match(struct perf_event *event)
return is_intel_pt_event(event);
}

+static int intel_pmu_filter_match(struct perf_event *event)
+{
+ struct x86_hybrid_pmu *pmu = hybrid_pmu(event->pmu);
+ unsigned int cpu = smp_processor_id();
+
+ return cpumask_test_cpu(cpu, &pmu->supported_cpus);
+}
+
PMU_FORMAT_ATTR(offcore_rsp, "config1:0-63");

PMU_FORMAT_ATTR(ldlat, "config1:0-15");
@@ -5124,6 +5205,84 @@ static const struct attribute_group *attr_update[] = {
NULL,
};

+EVENT_ATTR_STR_HYBRID(slots, slots_adl, "event=0x00,umask=0x4", hybrid_big);
+EVENT_ATTR_STR_HYBRID(topdown-retiring, td_retiring_adl, "event=0xc2,umask=0x0;event=0x00,umask=0x80", hybrid_big_small);
+EVENT_ATTR_STR_HYBRID(topdown-bad-spec, td_bad_spec_adl, "event=0x73,umask=0x0;event=0x00,umask=0x81", hybrid_big_small);
+EVENT_ATTR_STR_HYBRID(topdown-fe-bound, td_fe_bound_adl, "event=0x71,umask=0x0;event=0x00,umask=0x82", hybrid_big_small);
+EVENT_ATTR_STR_HYBRID(topdown-be-bound, td_be_bound_adl, "event=0x74,umask=0x0;event=0x00,umask=0x83", hybrid_big_small);
+EVENT_ATTR_STR_HYBRID(topdown-heavy-ops, td_heavy_ops_adl, "event=0x00,umask=0x84", hybrid_big);
+EVENT_ATTR_STR_HYBRID(topdown-br-mispredict, td_br_mis_adl, "event=0x00,umask=0x85", hybrid_big);
+EVENT_ATTR_STR_HYBRID(topdown-fetch-lat, td_fetch_lat_adl, "event=0x00,umask=0x86", hybrid_big);
+EVENT_ATTR_STR_HYBRID(topdown-mem-bound, td_mem_bound_adl, "event=0x00,umask=0x87", hybrid_big);
+
+static struct attribute *adl_hybrid_events_attrs[] = {
+ EVENT_PTR(slots_adl),
+ EVENT_PTR(td_retiring_adl),
+ EVENT_PTR(td_bad_spec_adl),
+ EVENT_PTR(td_fe_bound_adl),
+ EVENT_PTR(td_be_bound_adl),
+ EVENT_PTR(td_heavy_ops_adl),
+ EVENT_PTR(td_br_mis_adl),
+ EVENT_PTR(td_fetch_lat_adl),
+ EVENT_PTR(td_mem_bound_adl),
+ NULL,
+};
+
+/* Must be in IDX order */
+EVENT_ATTR_STR_HYBRID(mem-loads, mem_ld_adl, "event=0xd0,umask=0x5,ldlat=3;event=0xcd,umask=0x1,ldlat=3", hybrid_big_small);
+EVENT_ATTR_STR_HYBRID(mem-stores, mem_st_adl, "event=0xd0,umask=0x6;event=0xcd,umask=0x2", hybrid_big_small);
+EVENT_ATTR_STR_HYBRID(mem-loads-aux, mem_ld_aux_adl, "event=0x03,umask=0x82", hybrid_big);
+
+static struct attribute *adl_hybrid_mem_attrs[] = {
+ EVENT_PTR(mem_ld_adl),
+ EVENT_PTR(mem_st_adl),
+ EVENT_PTR(mem_ld_aux_adl),
+ NULL,
+};
+
+EVENT_ATTR_STR_HYBRID(tx-start, tx_start_adl, "event=0xc9,umask=0x1", hybrid_big);
+EVENT_ATTR_STR_HYBRID(tx-commit, tx_commit_adl, "event=0xc9,umask=0x2", hybrid_big);
+EVENT_ATTR_STR_HYBRID(tx-abort, tx_abort_adl, "event=0xc9,umask=0x4", hybrid_big);
+EVENT_ATTR_STR_HYBRID(tx-conflict, tx_conflict_adl, "event=0x54,umask=0x1", hybrid_big);
+EVENT_ATTR_STR_HYBRID(cycles-t, cycles_t_adl, "event=0x3c,in_tx=1", hybrid_big);
+EVENT_ATTR_STR_HYBRID(cycles-ct, cycles_ct_adl, "event=0x3c,in_tx=1,in_tx_cp=1", hybrid_big);
+EVENT_ATTR_STR_HYBRID(tx-capacity-read, tx_capacity_read_adl, "event=0x54,umask=0x80", hybrid_big);
+EVENT_ATTR_STR_HYBRID(tx-capacity-write, tx_capacity_write_adl, "event=0x54,umask=0x2", hybrid_big);
+
+static struct attribute *adl_hybrid_tsx_attrs[] = {
+ EVENT_PTR(tx_start_adl),
+ EVENT_PTR(tx_abort_adl),
+ EVENT_PTR(tx_commit_adl),
+ EVENT_PTR(tx_capacity_read_adl),
+ EVENT_PTR(tx_capacity_write_adl),
+ EVENT_PTR(tx_conflict_adl),
+ EVENT_PTR(cycles_t_adl),
+ EVENT_PTR(cycles_ct_adl),
+ NULL,
+};
+
+FORMAT_ATTR_HYBRID(in_tx, hybrid_big);
+FORMAT_ATTR_HYBRID(in_tx_cp, hybrid_big);
+FORMAT_ATTR_HYBRID(offcore_rsp, hybrid_big_small);
+FORMAT_ATTR_HYBRID(ldlat, hybrid_big_small);
+FORMAT_ATTR_HYBRID(frontend, hybrid_big);
+
+static struct attribute *adl_hybrid_extra_attr_rtm[] = {
+ FORMAT_HYBRID_PTR(in_tx),
+ FORMAT_HYBRID_PTR(in_tx_cp),
+ FORMAT_HYBRID_PTR(offcore_rsp),
+ FORMAT_HYBRID_PTR(ldlat),
+ FORMAT_HYBRID_PTR(frontend),
+ NULL,
+};
+
+static struct attribute *adl_hybrid_extra_attr[] = {
+ FORMAT_HYBRID_PTR(offcore_rsp),
+ FORMAT_HYBRID_PTR(ldlat),
+ FORMAT_HYBRID_PTR(frontend),
+ NULL,
+};
+
static bool is_attr_for_this_pmu(struct kobject *kobj, struct attribute *attr)
{
struct device *dev = kobj_to_dev(kobj);
@@ -5353,6 +5512,7 @@ __init int intel_pmu_init(void)
bool pmem = false;
int version, i;
char *name;
+ struct x86_hybrid_pmu *pmu;

if (!cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
switch (boot_cpu_data.x86) {
@@ -5947,6 +6107,98 @@ __init int intel_pmu_init(void)
name = "sapphire_rapids";
break;

+ case INTEL_FAM6_ALDERLAKE:
+ case INTEL_FAM6_ALDERLAKE_L:
+ /*
+ * Alder Lake has 2 types of CPU, core and atom.
+ *
+ * Initialize the common PerfMon capabilities here.
+ */
+ x86_pmu.hybrid_pmu = kcalloc(X86_HYBRID_NUM_PMUS,
+ sizeof(struct x86_hybrid_pmu),
+ GFP_KERNEL);
+ if (!x86_pmu.hybrid_pmu)
+ return -ENOMEM;
+ x86_pmu.num_hybrid_pmus = X86_HYBRID_NUM_PMUS;
+
+ x86_pmu.late_ack = true;
+ x86_pmu.pebs_aliases = NULL;
+ x86_pmu.pebs_prec_dist = true;
+ x86_pmu.pebs_block = true;
+ x86_pmu.flags |= PMU_FL_HAS_RSP_1;
+ x86_pmu.flags |= PMU_FL_NO_HT_SHARING;
+ x86_pmu.flags |= PMU_FL_PEBS_ALL;
+ x86_pmu.flags |= PMU_FL_INSTR_LATENCY;
+ x86_pmu.flags |= PMU_FL_MEM_LOADS_AUX;
+ x86_pmu.lbr_pt_coexist = true;
+ intel_pmu_pebs_data_source_skl(false);
+ x86_pmu.num_topdown_events = 8;
+ x86_pmu.update_topdown_event = adl_update_topdown_event;
+ x86_pmu.set_topdown_event_period = adl_set_topdown_event_period;
+
+ x86_pmu.filter_match = intel_pmu_filter_match;
+ x86_pmu.get_event_constraints = adl_get_event_constraints;
+ x86_pmu.hw_config = adl_hw_config;
+ x86_pmu.limit_period = spr_limit_period;
+ x86_pmu.get_hybrid_cpu_type = adl_get_hybrid_cpu_type;
+ /*
+ * The rtm_abort_event is used to check whether to enable GPRs
+ * for the RTM abort event. Atom doesn't have the RTM abort
+ * event. There is no harm in setting it in the common
+ * x86_pmu.rtm_abort_event.
+ */
+ x86_pmu.rtm_abort_event = X86_CONFIG(.event=0xc9, .umask=0x04);
+
+ td_attr = adl_hybrid_events_attrs;
+ mem_attr = adl_hybrid_mem_attrs;
+ tsx_attr = adl_hybrid_tsx_attrs;
+ extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
+ adl_hybrid_extra_attr_rtm : adl_hybrid_extra_attr;
+
+ /* Initialize big core specific PerfMon capabilities.*/
+ pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_CORE_IDX];
+ pmu->name = "cpu_core";
+ pmu->cpu_type = hybrid_big;
+ pmu->num_counters = x86_pmu.num_counters + 2;
+ pmu->num_counters_fixed = x86_pmu.num_counters_fixed + 1;
+ pmu->max_pebs_events = min_t(unsigned, MAX_PEBS_EVENTS, pmu->num_counters);
+ pmu->unconstrained = (struct event_constraint)
+ __EVENT_CONSTRAINT(0, (1ULL << pmu->num_counters) - 1,
+ 0, pmu->num_counters, 0, 0);
+ pmu->intel_cap.capabilities = x86_pmu.intel_cap.capabilities;
+ pmu->intel_cap.perf_metrics = 1;
+ pmu->intel_cap.pebs_output_pt_available = 0;
+
+ memcpy(pmu->hw_cache_event_ids, spr_hw_cache_event_ids, sizeof(pmu->hw_cache_event_ids));
+ memcpy(pmu->hw_cache_extra_regs, spr_hw_cache_extra_regs, sizeof(pmu->hw_cache_extra_regs));
+ pmu->event_constraints = intel_spr_event_constraints;
+ pmu->pebs_constraints = intel_spr_pebs_event_constraints;
+ pmu->extra_regs = intel_spr_extra_regs;
+
+ /* Initialize Atom core specific PerfMon capabilities.*/
+ pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_ATOM_IDX];
+ pmu->name = "cpu_atom";
+ pmu->cpu_type = hybrid_small;
+ pmu->num_counters = x86_pmu.num_counters;
+ pmu->num_counters_fixed = x86_pmu.num_counters_fixed;
+ pmu->max_pebs_events = x86_pmu.max_pebs_events;
+ pmu->unconstrained = (struct event_constraint)
+ __EVENT_CONSTRAINT(0, (1ULL << pmu->num_counters) - 1,
+ 0, pmu->num_counters, 0, 0);
+ pmu->intel_cap.capabilities = x86_pmu.intel_cap.capabilities;
+ pmu->intel_cap.perf_metrics = 0;
+ pmu->intel_cap.pebs_output_pt_available = 1;
+
+ memcpy(pmu->hw_cache_event_ids, glp_hw_cache_event_ids, sizeof(pmu->hw_cache_event_ids));
+ memcpy(pmu->hw_cache_extra_regs, tnt_hw_cache_extra_regs, sizeof(pmu->hw_cache_extra_regs));
+ pmu->hw_cache_event_ids[C(ITLB)][C(OP_READ)][C(RESULT_ACCESS)] = -1;
+ pmu->event_constraints = intel_slm_event_constraints;
+ pmu->pebs_constraints = intel_grt_pebs_event_constraints;
+ pmu->extra_regs = intel_grt_extra_regs;
+ pr_cont("Alderlake Hybrid events, ");
+ name = "alderlake_hybrid";
+ break;
+
default:
switch (x86_pmu.version) {
case 1:
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index f1402bc..2780cb5 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -779,6 +779,13 @@ struct event_constraint intel_glm_pebs_event_constraints[] = {
EVENT_CONSTRAINT_END
};

+struct event_constraint intel_grt_pebs_event_constraints[] = {
+ /* Allow all events as PEBS with no flags */
+ INTEL_PLD_CONSTRAINT(0x5d0, 0xf),
+ INTEL_PSD_CONSTRAINT(0x6d0, 0xf),
+ EVENT_CONSTRAINT_END
+};
+
struct event_constraint intel_nehalem_pebs_event_constraints[] = {
INTEL_PLD_CONSTRAINT(0x100b, 0xf), /* MEM_INST_RETIRED.* */
INTEL_FLAGS_EVENT_CONSTRAINT(0x0f, 0xf), /* MEM_UNCORE_RETIRED.* */
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index f996686..bfbecde 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -683,6 +683,11 @@ enum hybrid_pmu_type {
hybrid_big_small = hybrid_big | hybrid_small,
};

+#define X86_HYBRID_PMU_ATOM_IDX 0
+#define X86_HYBRID_PMU_CORE_IDX 1
+
+#define X86_HYBRID_NUM_PMUS 2
+
/*
* struct x86_pmu - generic x86 pmu
*/
@@ -1249,6 +1254,8 @@ extern struct event_constraint intel_glm_pebs_event_constraints[];

extern struct event_constraint intel_glp_pebs_event_constraints[];

+extern struct event_constraint intel_grt_pebs_event_constraints[];
+
extern struct event_constraint intel_nehalem_pebs_event_constraints[];

extern struct event_constraint intel_westmere_pebs_event_constraints[];
--
2.7.4

2021-04-05 22:55:32

by Liang, Kan

Subject: [PATCH V5 25/25] perf/x86/rapl: Add support for Intel Alder Lake

From: Zhang Rui <[email protected]>

Alder Lake RAPL support is the same as the previous Sky Lake.
Add the Alder Lake models for RAPL.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
---
arch/x86/events/rapl.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/rapl.c b/arch/x86/events/rapl.c
index f42a704..84a1042 100644
--- a/arch/x86/events/rapl.c
+++ b/arch/x86/events/rapl.c
@@ -800,6 +800,8 @@ static const struct x86_cpu_id rapl_model_match[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &model_hsx),
X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE_L, &model_skl),
X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE, &model_skl),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &model_skl),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &model_skl),
X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &model_spr),
X86_MATCH_VENDOR_FAM(AMD, 0x17, &model_amd_fam17h),
X86_MATCH_VENDOR_FAM(HYGON, 0x18, &model_amd_fam17h),
--
2.7.4

2021-04-05 23:43:06

by Liang, Kan

Subject: [PATCH V5 09/25] perf/x86: Hybrid PMU support for event constraints

From: Kan Liang <[email protected]>

The events are different among hybrid PMUs. Each hybrid PMU should use
its own event constraints.
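
For reference, the hybrid(pmu, field) accessor used in the diff below is
introduced earlier in the series as a macro, so that it works for any
x86_pmu field; for this particular field its behaviour is roughly the
following sketch (not the macro itself):

static struct event_constraint *
hybrid_event_constraints_sketch(struct pmu *pmu)
{
        /* Non-hybrid systems keep using the single global copy. */
        if (!is_hybrid())
                return x86_pmu.event_constraints;

        /* Hybrid systems read the field of the PMU the event belongs to. */
        return hybrid_pmu(pmu)->event_constraints;
}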

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 3 ++-
arch/x86/events/intel/core.c | 5 +++--
arch/x86/events/intel/ds.c | 5 +++--
arch/x86/events/perf_event.h | 2 ++
4 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index d71ca69..b866867 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1526,6 +1526,7 @@ void perf_event_print_debug(void)
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
int num_counters = hybrid(cpuc->pmu, num_counters);
int num_counters_fixed = hybrid(cpuc->pmu, num_counters_fixed);
+ struct event_constraint *pebs_constraints = hybrid(cpuc->pmu, pebs_constraints);
unsigned long flags;
int idx;

@@ -1545,7 +1546,7 @@ void perf_event_print_debug(void)
pr_info("CPU#%d: status: %016llx\n", cpu, status);
pr_info("CPU#%d: overflow: %016llx\n", cpu, overflow);
pr_info("CPU#%d: fixed: %016llx\n", cpu, fixed);
- if (x86_pmu.pebs_constraints) {
+ if (pebs_constraints) {
rdmsrl(MSR_IA32_PEBS_ENABLE, pebs);
pr_info("CPU#%d: pebs: %016llx\n", cpu, pebs);
}
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 39f57ae..d304ba3 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3136,10 +3136,11 @@ struct event_constraint *
x86_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
struct perf_event *event)
{
+ struct event_constraint *event_constraints = hybrid(cpuc->pmu, event_constraints);
struct event_constraint *c;

- if (x86_pmu.event_constraints) {
- for_each_event_constraint(c, x86_pmu.event_constraints) {
+ if (event_constraints) {
+ for_each_event_constraint(c, event_constraints) {
if (constraint_match(c, event->hw.config)) {
event->hw.flags |= c->flags;
return c;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 312bf3b..f1402bc 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -959,13 +959,14 @@ struct event_constraint intel_spr_pebs_event_constraints[] = {

struct event_constraint *intel_pebs_constraints(struct perf_event *event)
{
+ struct event_constraint *pebs_constraints = hybrid(event->pmu, pebs_constraints);
struct event_constraint *c;

if (!event->attr.precise_ip)
return NULL;

- if (x86_pmu.pebs_constraints) {
- for_each_event_constraint(c, x86_pmu.pebs_constraints) {
+ if (pebs_constraints) {
+ for_each_event_constraint(c, pebs_constraints) {
if (constraint_match(c, event->hw.config)) {
event->hw.flags |= c->flags;
return c;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 203c165..c32e9dc 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -649,6 +649,8 @@ struct x86_hybrid_pmu {
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX];
+ struct event_constraint *event_constraints;
+ struct event_constraint *pebs_constraints;
};

static __always_inline struct x86_hybrid_pmu *hybrid_pmu(struct pmu *pmu)
--
2.7.4

2021-04-05 23:44:03

by Liang, Kan

Subject: [PATCH V5 12/25] perf/x86/intel: Factor out intel_pmu_check_event_constraints

From: Kan Liang <[email protected]>

Each hybrid PMU has to check and update its own event constraints before
registration.

intel_pmu_check_event_constraints() will be reused later to check
the event constraints of each hybrid PMU.
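
A short walk-through with hypothetical numbers may help (the values below
are illustrative only, not taken from any specific CPU): assume
num_counters = 6, num_counters_fixed = 3, and an intel_ctrl that enumerates
GP counters 0-5 and fixed counters 0-2.

/*
 * Illustrative only: a constraint with cmask == FIXED_EVENT_FLAGS and
 * idxmsk64 == BIT_ULL(INTEL_PMC_IDX_FIXED + 1) (fixed counter 1).
 *
 *  - idxmsk64 &= intel_ctrl keeps BIT_ULL(33); the counter is enumerated.
 *  - The mask is not INTEL_PMC_MSK_FIXED_REF_CYCLES, so the constraint is
 *    also extended to the GP counters: idxmsk64 |= 0x3f.
 *  - Finally the mask is clipped below INTEL_PMC_IDX_FIXED + 3, so the
 *    event may be scheduled on fixed counter 1 or on any GP counter, and
 *    c->weight is recomputed from the new mask.
 */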

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 82 +++++++++++++++++++++++++-------------------
1 file changed, 47 insertions(+), 35 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 9394646..53a2e2e 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5090,6 +5090,49 @@ static void intel_pmu_check_num_counters(int *num_counters,
*intel_ctrl |= fixed_mask << INTEL_PMC_IDX_FIXED;
}

+static void intel_pmu_check_event_constraints(struct event_constraint *event_constraints,
+ int num_counters,
+ int num_counters_fixed,
+ u64 intel_ctrl)
+{
+ struct event_constraint *c;
+
+ if (!event_constraints)
+ return;
+
+ /*
+ * event on fixed counter2 (REF_CYCLES) only works on this
+ * counter, so do not extend mask to generic counters
+ */
+ for_each_event_constraint(c, event_constraints) {
+ /*
+ * Don't extend the topdown slots and metrics
+ * events to the generic counters.
+ */
+ if (c->idxmsk64 & INTEL_PMC_MSK_TOPDOWN) {
+ /*
+ * Disable topdown slots and metrics events,
+ * if slots event is not in CPUID.
+ */
+ if (!(INTEL_PMC_MSK_FIXED_SLOTS & intel_ctrl))
+ c->idxmsk64 = 0;
+ c->weight = hweight64(c->idxmsk64);
+ continue;
+ }
+
+ if (c->cmask == FIXED_EVENT_FLAGS) {
+ /* Disabled fixed counters which are not in CPUID */
+ c->idxmsk64 &= intel_ctrl;
+
+ if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
+ c->idxmsk64 |= (1ULL << num_counters) - 1;
+ }
+ c->idxmsk64 &=
+ ~(~0ULL << (INTEL_PMC_IDX_FIXED + num_counters_fixed));
+ c->weight = hweight64(c->idxmsk64);
+ }
+}
+
__init int intel_pmu_init(void)
{
struct attribute **extra_skl_attr = &empty_attrs;
@@ -5100,7 +5143,6 @@ __init int intel_pmu_init(void)
union cpuid10_edx edx;
union cpuid10_eax eax;
union cpuid10_ebx ebx;
- struct event_constraint *c;
unsigned int fixed_mask;
struct extra_reg *er;
bool pmem = false;
@@ -5738,40 +5780,10 @@ __init int intel_pmu_init(void)
if (x86_pmu.intel_cap.anythread_deprecated)
x86_pmu.format_attrs = intel_arch_formats_attr;

- if (x86_pmu.event_constraints) {
- /*
- * event on fixed counter2 (REF_CYCLES) only works on this
- * counter, so do not extend mask to generic counters
- */
- for_each_event_constraint(c, x86_pmu.event_constraints) {
- /*
- * Don't extend the topdown slots and metrics
- * events to the generic counters.
- */
- if (c->idxmsk64 & INTEL_PMC_MSK_TOPDOWN) {
- /*
- * Disable topdown slots and metrics events,
- * if slots event is not in CPUID.
- */
- if (!(INTEL_PMC_MSK_FIXED_SLOTS & x86_pmu.intel_ctrl))
- c->idxmsk64 = 0;
- c->weight = hweight64(c->idxmsk64);
- continue;
- }
-
- if (c->cmask == FIXED_EVENT_FLAGS) {
- /* Disabled fixed counters which are not in CPUID */
- c->idxmsk64 &= x86_pmu.intel_ctrl;
-
- if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
- c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
- }
- c->idxmsk64 &=
- ~(~0ULL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
- c->weight = hweight64(c->idxmsk64);
- }
- }
-
+ intel_pmu_check_event_constraints(x86_pmu.event_constraints,
+ x86_pmu.num_counters,
+ x86_pmu.num_counters_fixed,
+ x86_pmu.intel_ctrl);
/*
* Access LBR MSR may cause #GP under certain circumstances.
* E.g. KVM doesn't support LBR MSR
--
2.7.4

2021-04-05 23:51:35

by Liang, Kan

Subject: [PATCH V5 05/25] perf/x86: Hybrid PMU support for intel_ctrl

From: Kan Liang <[email protected]>

The intel_ctrl is the counter mask of a PMU. The PMU counter information
may be different among hybrid PMUs; each hybrid PMU should use its own
intel_ctrl to check and access the counters.

When handling a certain hybrid PMU, apply the intel_ctrl from the
corresponding hybrid PMU.

When checking the existence of the HW, apply the PMU and the number of
counters from the corresponding hybrid PMU as well. Perf will check the HW
existence for each hybrid PMU before registration. Expose
check_hw_exists() for a later patch.
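
For orientation (a sketch of existing behaviour, not code added by this
patch): intel_ctrl is built with the GP counters in the low bits and the
enumerated fixed counters shifted up by INTEL_PMC_IDX_FIXED, which is what
the reworked fixed_counter_disabled() relies on when it shifts the per-PMU
mask right by (i + INTEL_PMC_IDX_FIXED):

        u64 intel_ctrl = ((1ULL << num_counters) - 1) |
                         ((u64)fixed_mask << INTEL_PMC_IDX_FIXED);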

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 14 +++++++-------
arch/x86/events/intel/core.c | 14 +++++++++-----
arch/x86/events/perf_event.h | 10 ++++++++--
3 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index d3d3c6b..fc14697 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -230,7 +230,7 @@ static void release_pmc_hardware(void) {}

#endif

-static bool check_hw_exists(void)
+bool check_hw_exists(struct pmu *pmu, int num_counters, int num_counters_fixed)
{
u64 val, val_fail = -1, val_new= ~0;
int i, reg, reg_fail = -1, ret = 0;
@@ -241,7 +241,7 @@ static bool check_hw_exists(void)
* Check to see if the BIOS enabled any of the counters, if so
* complain and bail.
*/
- for (i = 0; i < x86_pmu.num_counters; i++) {
+ for (i = 0; i < num_counters; i++) {
reg = x86_pmu_config_addr(i);
ret = rdmsrl_safe(reg, &val);
if (ret)
@@ -255,13 +255,13 @@ static bool check_hw_exists(void)
}
}

- if (x86_pmu.num_counters_fixed) {
+ if (num_counters_fixed) {
reg = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
ret = rdmsrl_safe(reg, &val);
if (ret)
goto msr_fail;
- for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
- if (fixed_counter_disabled(i))
+ for (i = 0; i < num_counters_fixed; i++) {
+ if (fixed_counter_disabled(i, pmu))
continue;
if (val & (0x03 << i*4)) {
bios_fail = 1;
@@ -1547,7 +1547,7 @@ void perf_event_print_debug(void)
cpu, idx, prev_left);
}
for (idx = 0; idx < x86_pmu.num_counters_fixed; idx++) {
- if (fixed_counter_disabled(idx))
+ if (fixed_counter_disabled(idx, cpuc->pmu))
continue;
rdmsrl(MSR_ARCH_PERFMON_FIXED_CTR0 + idx, pmc_count);

@@ -1992,7 +1992,7 @@ static int __init init_hw_perf_events(void)
pmu_check_apic();

/* sanity check that the hardware exists or is emulated */
- if (!check_hw_exists())
+ if (!check_hw_exists(&pmu, x86_pmu.num_counters, x86_pmu.num_counters_fixed))
return 0;

pr_cont("%s PMU driver.\n", x86_pmu.name);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 494b9bc..7cc2c45 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2153,10 +2153,11 @@ static void intel_pmu_disable_all(void)
static void __intel_pmu_enable_all(int added, bool pmi)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);

intel_pmu_lbr_enable_all(pmi);
wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL,
- x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask);
+ intel_ctrl & ~cpuc->intel_ctrl_guest_mask);

if (test_bit(INTEL_PMC_IDX_FIXED_BTS, cpuc->active_mask)) {
struct perf_event *event =
@@ -2709,6 +2710,7 @@ int intel_pmu_save_and_restart(struct perf_event *event)
static void intel_pmu_reset(void)
{
struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
unsigned long flags;
int idx;

@@ -2724,7 +2726,7 @@ static void intel_pmu_reset(void)
wrmsrl_safe(x86_pmu_event_addr(idx), 0ull);
}
for (idx = 0; idx < x86_pmu.num_counters_fixed; idx++) {
- if (fixed_counter_disabled(idx))
+ if (fixed_counter_disabled(idx, cpuc->pmu))
continue;
wrmsrl_safe(MSR_ARCH_PERFMON_FIXED_CTR0 + idx, 0ull);
}
@@ -2753,6 +2755,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
int bit;
int handled = 0;
+ u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);

inc_irq_stat(apic_perf_irqs);

@@ -2798,7 +2801,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)

handled++;
x86_pmu.drain_pebs(regs, &data);
- status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
+ status &= intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;

/*
* PMI throttle may be triggered, which stops the PEBS event.
@@ -3807,10 +3810,11 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
+ u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);

arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
- arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
- arr[0].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask;
+ arr[0].host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
+ arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
if (x86_pmu.flags & PMU_FL_PEBS_ALL)
arr[0].guest &= ~cpuc->pebs_enabled;
else
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index df65012d..f9b1eee 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -635,6 +635,7 @@ enum {
struct x86_hybrid_pmu {
struct pmu pmu;
union perf_capabilities intel_cap;
+ u64 intel_ctrl;
};

static __always_inline struct x86_hybrid_pmu *hybrid_pmu(struct pmu *pmu)
@@ -1000,6 +1001,9 @@ static inline int x86_pmu_rdpmc_index(int index)
return x86_pmu.rdpmc_index ? x86_pmu.rdpmc_index(index) : index;
}

+bool check_hw_exists(struct pmu *pmu, int num_counters,
+ int num_counters_fixed);
+
int x86_add_exclusive(unsigned int what);

void x86_del_exclusive(unsigned int what);
@@ -1104,9 +1108,11 @@ ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
ssize_t events_ht_sysfs_show(struct device *dev, struct device_attribute *attr,
char *page);

-static inline bool fixed_counter_disabled(int i)
+static inline bool fixed_counter_disabled(int i, struct pmu *pmu)
{
- return !(x86_pmu.intel_ctrl >> (i + INTEL_PMC_IDX_FIXED));
+ u64 intel_ctrl = hybrid(pmu, intel_ctrl);
+
+ return !(intel_ctrl >> (i + INTEL_PMC_IDX_FIXED));
}

#ifdef CONFIG_CPU_SUP_AMD
--
2.7.4

2021-04-05 23:51:52

by Liang, Kan

Subject: [PATCH V5 13/25] perf/x86/intel: Factor out intel_pmu_check_extra_regs

From: Kan Liang <[email protected]>

Each hybrid PMU has to check and update its own extra registers before
registration.

intel_pmu_check_extra_regs() will be reused later to check the extra
registers of each hybrid PMU.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 35 +++++++++++++++++++++--------------
1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 53a2e2e..d1a13e0 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5133,6 +5133,26 @@ static void intel_pmu_check_event_constraints(struct event_constraint *event_con
}
}

+static void intel_pmu_check_extra_regs(struct extra_reg *extra_regs)
+{
+ struct extra_reg *er;
+
+ /*
+ * Access extra MSR may cause #GP under certain circumstances.
+ * E.g. KVM doesn't support offcore event
+ * Check all extra_regs here.
+ */
+ if (!extra_regs)
+ return;
+
+ for (er = extra_regs; er->msr; er++) {
+ er->extra_msr_access = check_msr(er->msr, 0x11UL);
+ /* Disable LBR select mapping */
+ if ((er->idx == EXTRA_REG_LBR) && !er->extra_msr_access)
+ x86_pmu.lbr_sel_map = NULL;
+ }
+}
+
__init int intel_pmu_init(void)
{
struct attribute **extra_skl_attr = &empty_attrs;
@@ -5144,7 +5164,6 @@ __init int intel_pmu_init(void)
union cpuid10_eax eax;
union cpuid10_ebx ebx;
unsigned int fixed_mask;
- struct extra_reg *er;
bool pmem = false;
int version, i;
char *name;
@@ -5801,19 +5820,7 @@ __init int intel_pmu_init(void)
if (x86_pmu.lbr_nr)
pr_cont("%d-deep LBR, ", x86_pmu.lbr_nr);

- /*
- * Access extra MSR may cause #GP under certain circumstances.
- * E.g. KVM doesn't support offcore event
- * Check all extra_regs here.
- */
- if (x86_pmu.extra_regs) {
- for (er = x86_pmu.extra_regs; er->msr; er++) {
- er->extra_msr_access = check_msr(er->msr, 0x11UL);
- /* Disable LBR select mapping */
- if ((er->idx == EXTRA_REG_LBR) && !er->extra_msr_access)
- x86_pmu.lbr_sel_map = NULL;
- }
- }
+ intel_pmu_check_extra_regs(x86_pmu.extra_regs);

/* Support full width counters using alternative MSR range */
if (x86_pmu.intel_cap.full_width_write) {
--
2.7.4

2021-04-05 23:52:02

by Liang, Kan

Subject: [PATCH V5 16/25] perf/x86: Register hybrid PMUs

From: Kan Liang <[email protected]>

Different hybrid PMUs have different PMU capabilities and events. Perf
should register a dedicated PMU for each of them.

To check an X86 event, perf has to go through all possible hybrid PMUs.

All the hybrid PMUs are registered at boot time. Before the
registration, add intel_pmu_check_hybrid_pmus() to check and update the
counter information, the event constraints, the extra registers and the
unique capabilities for each hybrid PMU.

Postpone the display of the PMU information and the HW check to
CPU_STARTING, because the boot CPU is the only online CPU in
init_hw_perf_events(). Perf doesn't know the availability of the other
PMUs at that point. Perf should display the PMU information only if the
counters of the PMU are available.

All CPUs of one type may be offline. For this case, users can still
observe the PMU in /sys/devices, but its CPU mask is 0.

All hybrid PMUs have the capability PERF_PMU_CAP_HETEROGENEOUS_CPUS.
The PMU name for hybrid PMUs will be "cpu_XXX", which will be assigned
later in a separate patch.

The PMU type id for the core PMU is still PERF_TYPE_RAW. For the other
hybrid PMUs, the PMU type id is not hard-coded.

The event->cpu must be compatible with the supported CPUs of the PMU.
Add a check in x86_pmu_event_init().

The events in a group must be from the same type of hybrid PMU.
The fake cpuc used in the validation must be allocated from a CPU supported
by the event->pmu.

Perf may not retrieve a valid core type from get_this_hybrid_cpu_type().
For example, ADL may have an alternative configuration. With that
configuration, perf cannot retrieve the core type from CPUID leaf
0x1a. Add a platform-specific get_hybrid_cpu_type(). If the generic way
fails, invoke the platform-specific get_hybrid_cpu_type().
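
A hypothetical user-space illustration of the event->cpu check and the
same-PMU group rule added by this patch (the helper below is illustrative;
raw event 0x3c and the idea of passing in a sysfs-derived type id are
assumptions, not code from the series):

#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static int open_raw_cycles(int pmu_type, int cpu, int group_fd)
{
        struct perf_event_attr attr = {
                .size   = sizeof(attr),
                .type   = pmu_type,     /* dynamic id read from sysfs */
                .config = 0x3c,         /* architectural core cycles */
        };

        /*
         * With this patch, two new failures can be observed here:
         *  - cpu pinned to a CPU outside the PMU's supported_cpus mask
         *    is rejected in x86_pmu_event_init() with -ENOENT;
         *  - group_fd pointing at a leader on a different hybrid PMU
         *    is rejected in validate_group() with -EINVAL.
         */
        return syscall(__NR_perf_event_open, &attr, 0, cpu, group_fd, 0);
}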

Suggested-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 138 +++++++++++++++++++++++++++++++++++++------
arch/x86/events/intel/core.c | 93 ++++++++++++++++++++++++++++-
arch/x86/events/perf_event.h | 14 +++++
3 files changed, 224 insertions(+), 21 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index f9d299b..901b52c 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -485,7 +485,7 @@ int x86_setup_perfctr(struct perf_event *event)
local64_set(&hwc->period_left, hwc->sample_period);
}

- if (attr->type == PERF_TYPE_RAW)
+ if (attr->type == event->pmu->type)
return x86_pmu_extra_regs(event->attr.config, event);

if (attr->type == PERF_TYPE_HW_CACHE)
@@ -620,7 +620,7 @@ int x86_pmu_hw_config(struct perf_event *event)
if (!event->attr.exclude_kernel)
event->hw.config |= ARCH_PERFMON_EVENTSEL_OS;

- if (event->attr.type == PERF_TYPE_RAW)
+ if (event->attr.type == event->pmu->type)
event->hw.config |= event->attr.config & X86_RAW_EVENT_MASK;

if (event->attr.sample_period && x86_pmu.limit_period) {
@@ -749,7 +749,17 @@ void x86_pmu_enable_all(int added)

static inline int is_x86_event(struct perf_event *event)
{
- return event->pmu == &pmu;
+ int i;
+
+ if (!is_hybrid())
+ return event->pmu == &pmu;
+
+ for (i = 0; i < x86_pmu.num_hybrid_pmus; i++) {
+ if (event->pmu == &x86_pmu.hybrid_pmu[i].pmu)
+ return true;
+ }
+
+ return false;
}

struct pmu *x86_get_pmu(unsigned int cpu)
@@ -1998,6 +2008,23 @@ void x86_pmu_show_pmu_cap(int num_counters, int num_counters_fixed,
pr_info("... event mask: %016Lx\n", intel_ctrl);
}

+/*
+ * The generic code is not hybrid friendly. The hybrid_pmu->pmu
+ * of the first registered PMU is unconditionally assigned to
+ * each possible cpuctx->ctx.pmu.
+ * Update the correct hybrid PMU to the cpuctx->ctx.pmu.
+ */
+void x86_pmu_update_cpu_context(struct pmu *pmu, int cpu)
+{
+ struct perf_cpu_context *cpuctx;
+
+ if (!pmu->pmu_cpu_context)
+ return;
+
+ cpuctx = per_cpu_ptr(pmu->pmu_cpu_context, cpu);
+ cpuctx->ctx.pmu = pmu;
+}
+
static int __init init_hw_perf_events(void)
{
struct x86_pmu_quirk *quirk;
@@ -2058,8 +2085,11 @@ static int __init init_hw_perf_events(void)

pmu.attr_update = x86_pmu.attr_update;

- x86_pmu_show_pmu_cap(x86_pmu.num_counters, x86_pmu.num_counters_fixed,
- x86_pmu.intel_ctrl);
+ if (!is_hybrid()) {
+ x86_pmu_show_pmu_cap(x86_pmu.num_counters,
+ x86_pmu.num_counters_fixed,
+ x86_pmu.intel_ctrl);
+ }

if (!x86_pmu.read)
x86_pmu.read = _x86_pmu_read;
@@ -2089,9 +2119,46 @@ static int __init init_hw_perf_events(void)
if (err)
goto out1;

- err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
- if (err)
- goto out2;
+ if (!is_hybrid()) {
+ err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
+ if (err)
+ goto out2;
+ } else {
+ u8 cpu_type = get_this_hybrid_cpu_type();
+ struct x86_hybrid_pmu *hybrid_pmu;
+ bool registered = false;
+ int i;
+
+ if (!cpu_type && x86_pmu.get_hybrid_cpu_type)
+ cpu_type = x86_pmu.get_hybrid_cpu_type();
+
+ for (i = 0; i < x86_pmu.num_hybrid_pmus; i++) {
+ hybrid_pmu = &x86_pmu.hybrid_pmu[i];
+
+ hybrid_pmu->pmu = pmu;
+ hybrid_pmu->pmu.type = -1;
+ hybrid_pmu->pmu.attr_update = x86_pmu.attr_update;
+ hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;
+
+ err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name,
+ (hybrid_pmu->cpu_type == hybrid_big) ? PERF_TYPE_RAW : -1);
+ if (err)
+ continue;
+
+ if (cpu_type == hybrid_pmu->cpu_type)
+ x86_pmu_update_cpu_context(&hybrid_pmu->pmu, raw_smp_processor_id());
+
+ registered = true;
+ }
+
+ if (!registered) {
+ pr_warn("Failed to register hybrid PMUs\n");
+ kfree(x86_pmu.hybrid_pmu);
+ x86_pmu.hybrid_pmu = NULL;
+ x86_pmu.num_hybrid_pmus = 0;
+ goto out2;
+ }
+ }

return 0;

@@ -2216,16 +2283,27 @@ static void free_fake_cpuc(struct cpu_hw_events *cpuc)
kfree(cpuc);
}

-static struct cpu_hw_events *allocate_fake_cpuc(void)
+static struct cpu_hw_events *allocate_fake_cpuc(struct pmu *event_pmu)
{
struct cpu_hw_events *cpuc;
- int cpu = raw_smp_processor_id();
+ int cpu;

cpuc = kzalloc(sizeof(*cpuc), GFP_KERNEL);
if (!cpuc)
return ERR_PTR(-ENOMEM);
cpuc->is_fake = 1;

+ if (is_hybrid()) {
+ struct x86_hybrid_pmu *h_pmu;
+
+ h_pmu = hybrid_pmu(event_pmu);
+ if (cpumask_empty(&h_pmu->supported_cpus))
+ goto error;
+ cpu = cpumask_first(&h_pmu->supported_cpus);
+ } else
+ cpu = raw_smp_processor_id();
+ cpuc->pmu = event_pmu;
+
if (intel_cpuc_prepare(cpuc, cpu))
goto error;

@@ -2244,7 +2322,7 @@ static int validate_event(struct perf_event *event)
struct event_constraint *c;
int ret = 0;

- fake_cpuc = allocate_fake_cpuc();
+ fake_cpuc = allocate_fake_cpuc(event->pmu);
if (IS_ERR(fake_cpuc))
return PTR_ERR(fake_cpuc);

@@ -2278,7 +2356,27 @@ static int validate_group(struct perf_event *event)
struct cpu_hw_events *fake_cpuc;
int ret = -EINVAL, n;

- fake_cpuc = allocate_fake_cpuc();
+ /*
+ * Reject events from different hybrid PMUs.
+ */
+ if (is_hybrid()) {
+ struct perf_event *sibling;
+ struct pmu *pmu = NULL;
+
+ if (is_x86_event(leader))
+ pmu = leader->pmu;
+
+ for_each_sibling_event(sibling, leader) {
+ if (!is_x86_event(sibling))
+ continue;
+ if (!pmu)
+ pmu = sibling->pmu;
+ else if (pmu != sibling->pmu)
+ return ret;
+ }
+ }
+
+ fake_cpuc = allocate_fake_cpuc(event->pmu);
if (IS_ERR(fake_cpuc))
return PTR_ERR(fake_cpuc);
/*
@@ -2306,16 +2404,18 @@ static int validate_group(struct perf_event *event)

static int x86_pmu_event_init(struct perf_event *event)
{
+ struct x86_hybrid_pmu *pmu = NULL;
int err;

- switch (event->attr.type) {
- case PERF_TYPE_RAW:
- case PERF_TYPE_HARDWARE:
- case PERF_TYPE_HW_CACHE:
- break;
-
- default:
+ if ((event->attr.type != event->pmu->type) &&
+ (event->attr.type != PERF_TYPE_HARDWARE) &&
+ (event->attr.type != PERF_TYPE_HW_CACHE))
return -ENOENT;
+
+ if (is_hybrid() && (event->cpu != -1)) {
+ pmu = hybrid_pmu(event->pmu);
+ if (!cpumask_test_cpu(event->cpu, &pmu->supported_cpus))
+ return -ENOENT;
}

err = __x86_pmu_event_init(event);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index d1a13e0..27919ae 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3720,7 +3720,8 @@ static int intel_pmu_hw_config(struct perf_event *event)
event->hw.flags |= PERF_X86_EVENT_PEBS_VIA_PT;
}

- if (event->attr.type != PERF_TYPE_RAW)
+ if ((event->attr.type == PERF_TYPE_HARDWARE) ||
+ (event->attr.type == PERF_TYPE_HW_CACHE))
return 0;

/*
@@ -4218,12 +4219,62 @@ static void flip_smm_bit(void *data)
}
}

+static bool init_hybrid_pmu(int cpu)
+{
+ struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
+ u8 cpu_type = get_this_hybrid_cpu_type();
+ struct x86_hybrid_pmu *pmu = NULL;
+ int i;
+
+ if (!cpu_type && x86_pmu.get_hybrid_cpu_type)
+ cpu_type = x86_pmu.get_hybrid_cpu_type();
+
+ for (i = 0; i < x86_pmu.num_hybrid_pmus; i++) {
+ if (x86_pmu.hybrid_pmu[i].cpu_type == cpu_type) {
+ pmu = &x86_pmu.hybrid_pmu[i];
+ break;
+ }
+ }
+ if (WARN_ON_ONCE(!pmu || (pmu->pmu.type == -1))) {
+ cpuc->pmu = NULL;
+ return false;
+ }
+
+ /* Only check and dump the PMU information for the first CPU */
+ if (!cpumask_empty(&pmu->supported_cpus))
+ goto end;
+
+ if (!check_hw_exists(&pmu->pmu, pmu->num_counters, pmu->num_counters_fixed))
+ return false;
+
+ pr_info("%s PMU driver: ", pmu->name);
+
+ if (pmu->intel_cap.pebs_output_pt_available)
+ pr_cont("PEBS-via-PT ");
+
+ pr_cont("\n");
+
+ x86_pmu_show_pmu_cap(pmu->num_counters, pmu->num_counters_fixed,
+ pmu->intel_ctrl);
+
+end:
+ cpumask_set_cpu(cpu, &pmu->supported_cpus);
+ cpuc->pmu = &pmu->pmu;
+
+ x86_pmu_update_cpu_context(&pmu->pmu, cpu);
+
+ return true;
+}
+
static void intel_pmu_cpu_starting(int cpu)
{
struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
int core_id = topology_core_id(cpu);
int i;

+ if (is_hybrid() && !init_hybrid_pmu(cpu))
+ return;
+
init_debug_store_on_cpu(cpu);
/*
* Deal with CPUs that don't clear their LBRs on power-up.
@@ -4337,7 +4388,12 @@ void intel_cpuc_finish(struct cpu_hw_events *cpuc)

static void intel_pmu_cpu_dead(int cpu)
{
- intel_cpuc_finish(&per_cpu(cpu_hw_events, cpu));
+ struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu);
+
+ intel_cpuc_finish(cpuc);
+
+ if (is_hybrid() && cpuc->pmu)
+ cpumask_clear_cpu(cpu, &hybrid_pmu(cpuc->pmu)->supported_cpus);
}

static void intel_pmu_sched_task(struct perf_event_context *ctx,
@@ -5153,6 +5209,36 @@ static void intel_pmu_check_extra_regs(struct extra_reg *extra_regs)
}
}

+static void intel_pmu_check_hybrid_pmus(u64 fixed_mask)
+{
+ struct x86_hybrid_pmu *pmu;
+ int i;
+
+ for (i = 0; i < x86_pmu.num_hybrid_pmus; i++) {
+ pmu = &x86_pmu.hybrid_pmu[i];
+
+ intel_pmu_check_num_counters(&pmu->num_counters,
+ &pmu->num_counters_fixed,
+ &pmu->intel_ctrl,
+ fixed_mask);
+
+ if (pmu->intel_cap.perf_metrics) {
+ pmu->intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;
+ pmu->intel_ctrl |= INTEL_PMC_MSK_FIXED_SLOTS;
+ }
+
+ if (pmu->intel_cap.pebs_output_pt_available)
+ pmu->pmu.capabilities |= PERF_PMU_CAP_AUX_OUTPUT;
+
+ intel_pmu_check_event_constraints(pmu->event_constraints,
+ pmu->num_counters,
+ pmu->num_counters_fixed,
+ pmu->intel_ctrl);
+
+ intel_pmu_check_extra_regs(pmu->extra_regs);
+ }
+}
+
__init int intel_pmu_init(void)
{
struct attribute **extra_skl_attr = &empty_attrs;
@@ -5832,6 +5918,9 @@ __init int intel_pmu_init(void)
if (!is_hybrid() && x86_pmu.intel_cap.perf_metrics)
x86_pmu.intel_ctrl |= 1ULL << GLOBAL_CTRL_EN_PERF_METRICS;

+ if (is_hybrid())
+ intel_pmu_check_hybrid_pmus((u64)fixed_mask);
+
return 0;
}

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 1da91b7..35510a9 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -15,6 +15,7 @@
#include <linux/perf_event.h>

#include <asm/intel_ds.h>
+#include <asm/cpu.h>

/* To enable MSR tracing please use the generic trace points. */

@@ -634,6 +635,9 @@ enum {

struct x86_hybrid_pmu {
struct pmu pmu;
+ const char *name;
+ u8 cpu_type;
+ cpumask_t supported_cpus;
union perf_capabilities intel_cap;
u64 intel_ctrl;
int max_pebs_events;
@@ -672,6 +676,13 @@ static __always_inline struct x86_hybrid_pmu *hybrid_pmu(struct pmu *pmu)
__F; \
})

+enum hybrid_pmu_type {
+ hybrid_big = 0x40,
+ hybrid_small = 0x20,
+
+ hybrid_big_small = hybrid_big | hybrid_small,
+};
+
/*
* struct x86_pmu - generic x86 pmu
*/
@@ -869,6 +880,7 @@ struct x86_pmu {
*/
int num_hybrid_pmus;
struct x86_hybrid_pmu *hybrid_pmu;
+ u8 (*get_hybrid_cpu_type) (void);
};

struct x86_perf_task_context_opt {
@@ -1086,6 +1098,8 @@ int x86_pmu_handle_irq(struct pt_regs *regs);
void x86_pmu_show_pmu_cap(int num_counters, int num_counters_fixed,
u64 intel_ctrl);

+void x86_pmu_update_cpu_context(struct pmu *pmu, int cpu);
+
extern struct event_constraint emptyconstraint;

extern struct event_constraint unconstrained;
--
2.7.4

2021-04-05 23:52:22

by Liang, Kan

Subject: [PATCH V5 21/25] perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU

From: Kan Liang <[email protected]>

The current hardware events and hardware cache events have special perf
types, PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. The two types don't
pass the PMU type in the user interface. For a hybrid system, the perf
subsystem doesn't know which PMU the events belong to. The first capable
PMU will always be assigned to the events. The events never get a chance
to run on the other capable PMUs.

Add PMU-aware versions, PERF_TYPE_HARDWARE_PMU and
PERF_TYPE_HW_CACHE_PMU. The PMU type ID is stored in attr.config[40:32].
Support the new types for X86.
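
A hedged user-space sketch of the new encoding (the helper name is made up;
the layout follows the comment added to perf_event.h below, with the target
PMU's type id placed above bit 32 of attr.config. For the core PMU that id
is still PERF_TYPE_RAW per the "Register hybrid PMUs" patch; for cpu_atom
it is the dynamically assigned id that can be read from sysfs):

#include <unistd.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static int open_hw_cycles_on_pmu(unsigned long long pmu_type, pid_t pid, int cpu)
{
        struct perf_event_attr attr = {
                .size   = sizeof(attr),
                .type   = PERF_TYPE_HARDWARE_PMU,
                /* PMU type id in config[40:32], generic event id in the low bits. */
                .config = (pmu_type << PERF_PMU_TYPE_SHIFT) | PERF_COUNT_HW_CPU_CYCLES,
        };

        return syscall(__NR_perf_event_open, &attr, pid, cpu, -1, 0);
}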

Suggested-by: Andi Kleen <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 10 ++++++++--
include/uapi/linux/perf_event.h | 26 ++++++++++++++++++++++++++
kernel/events/core.c | 14 +++++++++++++-
3 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 09922ee..f8d1222 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -488,7 +488,7 @@ int x86_setup_perfctr(struct perf_event *event)
if (attr->type == event->pmu->type)
return x86_pmu_extra_regs(event->attr.config, event);

- if (attr->type == PERF_TYPE_HW_CACHE)
+ if ((attr->type == PERF_TYPE_HW_CACHE) || (attr->type == PERF_TYPE_HW_CACHE_PMU))
return set_ext_hw_attr(hwc, event);

if (attr->config >= x86_pmu.max_events)
@@ -2452,9 +2452,15 @@ static int x86_pmu_event_init(struct perf_event *event)

if ((event->attr.type != event->pmu->type) &&
(event->attr.type != PERF_TYPE_HARDWARE) &&
- (event->attr.type != PERF_TYPE_HW_CACHE))
+ (event->attr.type != PERF_TYPE_HW_CACHE) &&
+ (event->attr.type != PERF_TYPE_HARDWARE_PMU) &&
+ (event->attr.type != PERF_TYPE_HW_CACHE_PMU))
return -ENOENT;

+ if ((event->attr.type == PERF_TYPE_HARDWARE_PMU) ||
+ (event->attr.type == PERF_TYPE_HW_CACHE_PMU))
+ event->attr.config &= PERF_HW_CACHE_EVENT_MASK;
+
if (is_hybrid() && (event->cpu != -1)) {
pmu = hybrid_pmu(event->pmu);
if (!cpumask_test_cpu(event->cpu, &pmu->supported_cpus))
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index ad15e40..c0a511e 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -33,6 +33,8 @@ enum perf_type_id {
PERF_TYPE_HW_CACHE = 3,
PERF_TYPE_RAW = 4,
PERF_TYPE_BREAKPOINT = 5,
+ PERF_TYPE_HARDWARE_PMU = 6,
+ PERF_TYPE_HW_CACHE_PMU = 7,

PERF_TYPE_MAX, /* non-ABI */
};
@@ -95,6 +97,30 @@ enum perf_hw_cache_op_result_id {
};

/*
+ * attr.config layout for type PERF_TYPE_HARDWARE* and PERF_TYPE_HW_CACHE*
+ * PERF_TYPE_HARDWARE: 0xAA
+ * AA: hardware event ID
+ * PERF_TYPE_HW_CACHE: 0xCCBBAA
+ * AA: hardware cache ID
+ * BB: hardware cache op ID
+ * CC: hardware cache op result ID
+ * PERF_TYPE_HARDWARE_PMU: 0xDD000000AA
+ * AA: hardware event ID
+ * DD: PMU type ID
+ * PERF_TYPE_HW_CACHE_PMU: 0xDD00CCBBAA
+ * AA: hardware cache ID
+ * BB: hardware cache op ID
+ * CC: hardware cache op result ID
+ * DD: PMU type ID
+ */
+#define PERF_HW_CACHE_ID_SHIFT 0
+#define PERF_HW_CACHE_OP_ID_SHIFT 8
+#define PERF_HW_CACHE_OP_RESULT_ID_SHIFT 16
+#define PERF_HW_CACHE_EVENT_MASK 0xffffff
+
+#define PERF_PMU_TYPE_SHIFT 32
+
+/*
* Special "software" events provided by the kernel, even if the hardware
* does not support performance events. These events measure various
* physical and sw events of the kernel (and allow the profiling of them as
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f079431..b8ab756 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11093,6 +11093,14 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
return ret;
}

+static bool perf_event_is_hw_pmu_type(struct perf_event *event)
+{
+ int type = event->attr.type;
+
+ return type == PERF_TYPE_HARDWARE_PMU ||
+ type == PERF_TYPE_HW_CACHE_PMU;
+}
+
static struct pmu *perf_init_event(struct perf_event *event)
{
int idx, type, ret;
@@ -11116,13 +11124,17 @@ static struct pmu *perf_init_event(struct perf_event *event)
if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE)
type = PERF_TYPE_RAW;

+ if (perf_event_is_hw_pmu_type(event))
+ type = event->attr.config >> PERF_PMU_TYPE_SHIFT;
+
again:
rcu_read_lock();
pmu = idr_find(&pmu_idr, type);
rcu_read_unlock();
if (pmu) {
ret = perf_try_init_event(pmu, event);
- if (ret == -ENOENT && event->attr.type != type) {
+ if (ret == -ENOENT && event->attr.type != type &&
+ !perf_event_is_hw_pmu_type(event)) {
type = event->attr.type;
goto again;
}
--
2.7.4

2021-04-05 23:52:22

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V5 22/25] perf/x86/intel/uncore: Add Alder Lake support

From: Kan Liang <[email protected]>

The uncore subsystem for Alder Lake is similar to that of the previous
Tiger Lake.

The differences include:
- New MSR addresses for global control, fixed counters, CBOX and ARB.
Add a new adl_uncore_msr_ops for uncore operations.
- Add a new threshold field for CBOX.
- New PCIIDs for IMC devices.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/uncore.c | 7 ++
arch/x86/events/intel/uncore.h | 1 +
arch/x86/events/intel/uncore_snb.c | 131 +++++++++++++++++++++++++++++++++++++
3 files changed, 139 insertions(+)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 35b3470..70816f3 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1740,6 +1740,11 @@ static const struct intel_uncore_init_fun rkl_uncore_init __initconst = {
.pci_init = skl_uncore_pci_init,
};

+static const struct intel_uncore_init_fun adl_uncore_init __initconst = {
+ .cpu_init = adl_uncore_cpu_init,
+ .mmio_init = tgl_uncore_mmio_init,
+};
+
static const struct intel_uncore_init_fun icx_uncore_init __initconst = {
.cpu_init = icx_uncore_cpu_init,
.pci_init = icx_uncore_pci_init,
@@ -1794,6 +1799,8 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE_L, &tgl_l_uncore_init),
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE, &tgl_uncore_init),
X86_MATCH_INTEL_FAM6_MODEL(ROCKETLAKE, &rkl_uncore_init),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &adl_uncore_init),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &adl_uncore_init),
X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_D, &snr_uncore_init),
{},
};
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index 549cfb2..426212f 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -575,6 +575,7 @@ void snb_uncore_cpu_init(void);
void nhm_uncore_cpu_init(void);
void skl_uncore_cpu_init(void);
void icl_uncore_cpu_init(void);
+void adl_uncore_cpu_init(void);
void tgl_uncore_cpu_init(void);
void tgl_uncore_mmio_init(void);
void tgl_l_uncore_mmio_init(void);
diff --git a/arch/x86/events/intel/uncore_snb.c b/arch/x86/events/intel/uncore_snb.c
index 5127128..0f63706 100644
--- a/arch/x86/events/intel/uncore_snb.c
+++ b/arch/x86/events/intel/uncore_snb.c
@@ -62,6 +62,8 @@
#define PCI_DEVICE_ID_INTEL_TGL_H_IMC 0x9a36
#define PCI_DEVICE_ID_INTEL_RKL_1_IMC 0x4c43
#define PCI_DEVICE_ID_INTEL_RKL_2_IMC 0x4c53
+#define PCI_DEVICE_ID_INTEL_ADL_1_IMC 0x4660
+#define PCI_DEVICE_ID_INTEL_ADL_2_IMC 0x4641

/* SNB event control */
#define SNB_UNC_CTL_EV_SEL_MASK 0x000000ff
@@ -131,12 +133,33 @@
#define ICL_UNC_ARB_PER_CTR 0x3b1
#define ICL_UNC_ARB_PERFEVTSEL 0x3b3

+/* ADL uncore global control */
+#define ADL_UNC_PERF_GLOBAL_CTL 0x2ff0
+#define ADL_UNC_FIXED_CTR_CTRL 0x2fde
+#define ADL_UNC_FIXED_CTR 0x2fdf
+
+/* ADL Cbo register */
+#define ADL_UNC_CBO_0_PER_CTR0 0x2002
+#define ADL_UNC_CBO_0_PERFEVTSEL0 0x2000
+#define ADL_UNC_CTL_THRESHOLD 0x3f000000
+#define ADL_UNC_RAW_EVENT_MASK (SNB_UNC_CTL_EV_SEL_MASK | \
+ SNB_UNC_CTL_UMASK_MASK | \
+ SNB_UNC_CTL_EDGE_DET | \
+ SNB_UNC_CTL_INVERT | \
+ ADL_UNC_CTL_THRESHOLD)
+
+/* ADL ARB register */
+#define ADL_UNC_ARB_PER_CTR0 0x2FD2
+#define ADL_UNC_ARB_PERFEVTSEL0 0x2FD0
+#define ADL_UNC_ARB_MSR_OFFSET 0x8
+
DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
DEFINE_UNCORE_FORMAT_ATTR(umask, umask, "config:8-15");
DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18");
DEFINE_UNCORE_FORMAT_ATTR(inv, inv, "config:23");
DEFINE_UNCORE_FORMAT_ATTR(cmask5, cmask, "config:24-28");
DEFINE_UNCORE_FORMAT_ATTR(cmask8, cmask, "config:24-31");
+DEFINE_UNCORE_FORMAT_ATTR(threshold, threshold, "config:24-29");

/* Sandy Bridge uncore support */
static void snb_uncore_msr_enable_event(struct intel_uncore_box *box, struct perf_event *event)
@@ -422,6 +445,106 @@ void tgl_uncore_cpu_init(void)
skl_uncore_msr_ops.init_box = rkl_uncore_msr_init_box;
}

+static void adl_uncore_msr_init_box(struct intel_uncore_box *box)
+{
+ if (box->pmu->pmu_idx == 0)
+ wrmsrl(ADL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
+}
+
+static void adl_uncore_msr_enable_box(struct intel_uncore_box *box)
+{
+ wrmsrl(ADL_UNC_PERF_GLOBAL_CTL, SNB_UNC_GLOBAL_CTL_EN);
+}
+
+static void adl_uncore_msr_disable_box(struct intel_uncore_box *box)
+{
+ if (box->pmu->pmu_idx == 0)
+ wrmsrl(ADL_UNC_PERF_GLOBAL_CTL, 0);
+}
+
+static void adl_uncore_msr_exit_box(struct intel_uncore_box *box)
+{
+ if (box->pmu->pmu_idx == 0)
+ wrmsrl(ADL_UNC_PERF_GLOBAL_CTL, 0);
+}
+
+static struct intel_uncore_ops adl_uncore_msr_ops = {
+ .init_box = adl_uncore_msr_init_box,
+ .enable_box = adl_uncore_msr_enable_box,
+ .disable_box = adl_uncore_msr_disable_box,
+ .exit_box = adl_uncore_msr_exit_box,
+ .disable_event = snb_uncore_msr_disable_event,
+ .enable_event = snb_uncore_msr_enable_event,
+ .read_counter = uncore_msr_read_counter,
+};
+
+static struct attribute *adl_uncore_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_umask.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_threshold.attr,
+ NULL,
+};
+
+static const struct attribute_group adl_uncore_format_group = {
+ .name = "format",
+ .attrs = adl_uncore_formats_attr,
+};
+
+static struct intel_uncore_type adl_uncore_cbox = {
+ .name = "cbox",
+ .num_counters = 2,
+ .perf_ctr_bits = 44,
+ .perf_ctr = ADL_UNC_CBO_0_PER_CTR0,
+ .event_ctl = ADL_UNC_CBO_0_PERFEVTSEL0,
+ .event_mask = ADL_UNC_RAW_EVENT_MASK,
+ .msr_offset = ICL_UNC_CBO_MSR_OFFSET,
+ .ops = &adl_uncore_msr_ops,
+ .format_group = &adl_uncore_format_group,
+};
+
+static struct intel_uncore_type adl_uncore_arb = {
+ .name = "arb",
+ .num_counters = 2,
+ .num_boxes = 2,
+ .perf_ctr_bits = 44,
+ .perf_ctr = ADL_UNC_ARB_PER_CTR0,
+ .event_ctl = ADL_UNC_ARB_PERFEVTSEL0,
+ .event_mask = SNB_UNC_RAW_EVENT_MASK,
+ .msr_offset = ADL_UNC_ARB_MSR_OFFSET,
+ .constraints = snb_uncore_arb_constraints,
+ .ops = &adl_uncore_msr_ops,
+ .format_group = &snb_uncore_format_group,
+};
+
+static struct intel_uncore_type adl_uncore_clockbox = {
+ .name = "clock",
+ .num_counters = 1,
+ .num_boxes = 1,
+ .fixed_ctr_bits = 48,
+ .fixed_ctr = ADL_UNC_FIXED_CTR,
+ .fixed_ctl = ADL_UNC_FIXED_CTR_CTRL,
+ .single_fixed = 1,
+ .event_mask = SNB_UNC_CTL_EV_SEL_MASK,
+ .format_group = &icl_uncore_clock_format_group,
+ .ops = &adl_uncore_msr_ops,
+ .event_descs = icl_uncore_events,
+};
+
+static struct intel_uncore_type *adl_msr_uncores[] = {
+ &adl_uncore_cbox,
+ &adl_uncore_arb,
+ &adl_uncore_clockbox,
+ NULL,
+};
+
+void adl_uncore_cpu_init(void)
+{
+ adl_uncore_cbox.num_boxes = icl_get_cbox_num();
+ uncore_msr_uncores = adl_msr_uncores;
+}
+
enum {
SNB_PCI_UNCORE_IMC,
};
@@ -1203,6 +1326,14 @@ static const struct pci_device_id tgl_uncore_pci_ids[] = {
PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_TGL_H_IMC),
.driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
},
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ADL_1_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
+ { /* IMC */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ADL_2_IMC),
+ .driver_data = UNCORE_PCI_DEV_DATA(SNB_PCI_UNCORE_IMC, 0),
+ },
{ /* end: all zeroes */ }
};

--
2.7.4

2021-04-05 23:52:26

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V5 23/25] perf/x86/msr: Add Alder Lake CPU support

From: Kan Liang <[email protected]>

PPERF and SMI_COUNT MSRs are also supported on Alder Lake.

The External Design Specification (EDS) is not published yet. It comes
from an authoritative internal source.

The patch has been tested on real hardware.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/msr.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
index 680404c..c853b28 100644
--- a/arch/x86/events/msr.c
+++ b/arch/x86/events/msr.c
@@ -100,6 +100,8 @@ static bool test_intel(int idx, void *data)
case INTEL_FAM6_TIGERLAKE_L:
case INTEL_FAM6_TIGERLAKE:
case INTEL_FAM6_ROCKETLAKE:
+ case INTEL_FAM6_ALDERLAKE:
+ case INTEL_FAM6_ALDERLAKE_L:
if (idx == PERF_MSR_SMI || idx == PERF_MSR_PPERF)
return true;
break;
--
2.7.4

2021-04-06 00:10:44

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V5 24/25] perf/x86/cstate: Add Alder Lake CPU support

From: Kan Liang <[email protected]>

Compared with Rocket Lake, the CORE C1 Residency Counter is added for
Alder Lake, while the CORE C3 Residency Counter is removed. The other
counters are the same.

Create a new adl_cstates for Alder Lake. Update the comments
accordingly.

The External Design Specification (EDS) is not published yet. It comes
from an authoritative internal source.

The patch has been tested on real hardware.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/cstate.c | 39 +++++++++++++++++++++++++++++----------
1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/arch/x86/events/intel/cstate.c b/arch/x86/events/intel/cstate.c
index 407eee5..4333990 100644
--- a/arch/x86/events/intel/cstate.c
+++ b/arch/x86/events/intel/cstate.c
@@ -40,7 +40,7 @@
* Model specific counters:
* MSR_CORE_C1_RES: CORE C1 Residency Counter
* perf code: 0x00
- * Available model: SLM,AMT,GLM,CNL,TNT
+ * Available model: SLM,AMT,GLM,CNL,TNT,ADL
* Scope: Core (each processor core has a MSR)
* MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
* perf code: 0x01
@@ -51,46 +51,49 @@
* perf code: 0x02
* Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
* SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL,
- * TNT,RKL
+ * TNT,RKL,ADL
* Scope: Core
* MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
* perf code: 0x03
* Available model: SNB,IVB,HSW,BDW,SKL,CNL,KBL,CML,
- * ICL,TGL,RKL
+ * ICL,TGL,RKL,ADL
* Scope: Core
* MSR_PKG_C2_RESIDENCY: Package C2 Residency Counter.
* perf code: 0x00
* Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL,
- * KBL,CML,ICL,TGL,TNT,RKL
+ * KBL,CML,ICL,TGL,TNT,RKL,ADL
* Scope: Package (physical package)
* MSR_PKG_C3_RESIDENCY: Package C3 Residency Counter.
* perf code: 0x01
* Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,KNL,
- * GLM,CNL,KBL,CML,ICL,TGL,TNT,RKL
+ * GLM,CNL,KBL,CML,ICL,TGL,TNT,RKL,
+ * ADL
* Scope: Package (physical package)
* MSR_PKG_C6_RESIDENCY: Package C6 Residency Counter.
* perf code: 0x02
* Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
* SKL,KNL,GLM,CNL,KBL,CML,ICL,TGL,
- * TNT,RKL
+ * TNT,RKL,ADL
* Scope: Package (physical package)
* MSR_PKG_C7_RESIDENCY: Package C7 Residency Counter.
* perf code: 0x03
* Available model: NHM,WSM,SNB,IVB,HSW,BDW,SKL,CNL,
- * KBL,CML,ICL,TGL,RKL
+ * KBL,CML,ICL,TGL,RKL,ADL
* Scope: Package (physical package)
* MSR_PKG_C8_RESIDENCY: Package C8 Residency Counter.
* perf code: 0x04
- * Available model: HSW ULT,KBL,CNL,CML,ICL,TGL,RKL
+ * Available model: HSW ULT,KBL,CNL,CML,ICL,TGL,RKL,
+ * ADL
* Scope: Package (physical package)
* MSR_PKG_C9_RESIDENCY: Package C9 Residency Counter.
* perf code: 0x05
- * Available model: HSW ULT,KBL,CNL,CML,ICL,TGL,RKL
+ * Available model: HSW ULT,KBL,CNL,CML,ICL,TGL,RKL,
+ * ADL
* Scope: Package (physical package)
* MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
* perf code: 0x06
* Available model: HSW ULT,KBL,GLM,CNL,CML,ICL,TGL,
- * TNT,RKL
+ * TNT,RKL,ADL
* Scope: Package (physical package)
*
*/
@@ -563,6 +566,20 @@ static const struct cstate_model icl_cstates __initconst = {
BIT(PERF_CSTATE_PKG_C10_RES),
};

+static const struct cstate_model adl_cstates __initconst = {
+ .core_events = BIT(PERF_CSTATE_CORE_C1_RES) |
+ BIT(PERF_CSTATE_CORE_C6_RES) |
+ BIT(PERF_CSTATE_CORE_C7_RES),
+
+ .pkg_events = BIT(PERF_CSTATE_PKG_C2_RES) |
+ BIT(PERF_CSTATE_PKG_C3_RES) |
+ BIT(PERF_CSTATE_PKG_C6_RES) |
+ BIT(PERF_CSTATE_PKG_C7_RES) |
+ BIT(PERF_CSTATE_PKG_C8_RES) |
+ BIT(PERF_CSTATE_PKG_C9_RES) |
+ BIT(PERF_CSTATE_PKG_C10_RES),
+};
+
static const struct cstate_model slm_cstates __initconst = {
.core_events = BIT(PERF_CSTATE_CORE_C1_RES) |
BIT(PERF_CSTATE_CORE_C6_RES),
@@ -650,6 +667,8 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE_L, &icl_cstates),
X86_MATCH_INTEL_FAM6_MODEL(TIGERLAKE, &icl_cstates),
X86_MATCH_INTEL_FAM6_MODEL(ROCKETLAKE, &icl_cstates),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE, &adl_cstates),
+ X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &adl_cstates),
{ },
};
MODULE_DEVICE_TABLE(x86cpu, intel_cstates_match);
--
2.7.4

2021-04-06 00:10:44

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V5 18/25] perf/x86/intel: Add attr_update for Hybrid PMUs

From: Kan Liang <[email protected]>

The attribute_group for hybrid PMUs should be different from that of the
previous cpu PMU. For example, a cpumask is required for a hybrid PMU, and
the PMU type should be included in the event and format attributes.

Add hybrid_attr_update for the hybrid PMUs.
Check the PMU type in the is_visible() functions. Only display the event
or format for the matching hybrid PMU.

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 120 ++++++++++++++++++++++++++++++++++++++++---
1 file changed, 114 insertions(+), 6 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 27919ae..07af58c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5124,6 +5124,106 @@ static const struct attribute_group *attr_update[] = {
NULL,
};

+static bool is_attr_for_this_pmu(struct kobject *kobj, struct attribute *attr)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct x86_hybrid_pmu *pmu =
+ container_of(dev_get_drvdata(dev), struct x86_hybrid_pmu, pmu);
+ struct perf_pmu_events_hybrid_attr *pmu_attr =
+ container_of(attr, struct perf_pmu_events_hybrid_attr, attr.attr);
+
+ return pmu->cpu_type & pmu_attr->pmu_type;
+}
+
+static umode_t hybrid_events_is_visible(struct kobject *kobj,
+ struct attribute *attr, int i)
+{
+ return is_attr_for_this_pmu(kobj, attr) ? attr->mode : 0;
+}
+
+static inline int hybrid_find_supported_cpu(struct x86_hybrid_pmu *pmu)
+{
+ int cpu = cpumask_first(&pmu->supported_cpus);
+
+ return (cpu >= nr_cpu_ids) ? -1 : cpu;
+}
+
+static umode_t hybrid_tsx_is_visible(struct kobject *kobj,
+ struct attribute *attr, int i)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct x86_hybrid_pmu *pmu =
+ container_of(dev_get_drvdata(dev), struct x86_hybrid_pmu, pmu);
+ int cpu = hybrid_find_supported_cpu(pmu);
+
+ return (cpu >= 0) && is_attr_for_this_pmu(kobj, attr) && cpu_has(&cpu_data(cpu), X86_FEATURE_RTM) ? attr->mode : 0;
+}
+
+static umode_t hybrid_format_is_visible(struct kobject *kobj,
+ struct attribute *attr, int i)
+{
+ struct device *dev = kobj_to_dev(kobj);
+ struct x86_hybrid_pmu *pmu =
+ container_of(dev_get_drvdata(dev), struct x86_hybrid_pmu, pmu);
+ struct perf_pmu_format_hybrid_attr *pmu_attr =
+ container_of(attr, struct perf_pmu_format_hybrid_attr, attr.attr);
+ int cpu = hybrid_find_supported_cpu(pmu);
+
+ return (cpu >= 0) && (pmu->cpu_type & pmu_attr->pmu_type) ? attr->mode : 0;
+}
+
+static struct attribute_group hybrid_group_events_td = {
+ .name = "events",
+ .is_visible = hybrid_events_is_visible,
+};
+
+static struct attribute_group hybrid_group_events_mem = {
+ .name = "events",
+ .is_visible = hybrid_events_is_visible,
+};
+
+static struct attribute_group hybrid_group_events_tsx = {
+ .name = "events",
+ .is_visible = hybrid_tsx_is_visible,
+};
+
+static struct attribute_group hybrid_group_format_extra = {
+ .name = "format",
+ .is_visible = hybrid_format_is_visible,
+};
+
+static ssize_t intel_hybrid_get_attr_cpus(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ struct x86_hybrid_pmu *pmu =
+ container_of(dev_get_drvdata(dev), struct x86_hybrid_pmu, pmu);
+
+ return cpumap_print_to_pagebuf(true, buf, &pmu->supported_cpus);
+}
+
+static DEVICE_ATTR(cpus, S_IRUGO, intel_hybrid_get_attr_cpus, NULL);
+static struct attribute *intel_hybrid_cpus_attrs[] = {
+ &dev_attr_cpus.attr,
+ NULL,
+};
+
+static struct attribute_group hybrid_group_cpus = {
+ .attrs = intel_hybrid_cpus_attrs,
+};
+
+static const struct attribute_group *hybrid_attr_update[] = {
+ &hybrid_group_events_td,
+ &hybrid_group_events_mem,
+ &hybrid_group_events_tsx,
+ &group_caps_gen,
+ &group_caps_lbr,
+ &hybrid_group_format_extra,
+ &group_default,
+ &hybrid_group_cpus,
+ NULL,
+};
+
static struct attribute *empty_attrs;

static void intel_pmu_check_num_counters(int *num_counters,
@@ -5867,14 +5967,22 @@ __init int intel_pmu_init(void)

snprintf(pmu_name_str, sizeof(pmu_name_str), "%s", name);

+ if (!is_hybrid()) {
+ group_events_td.attrs = td_attr;
+ group_events_mem.attrs = mem_attr;
+ group_events_tsx.attrs = tsx_attr;
+ group_format_extra.attrs = extra_attr;
+ group_format_extra_skl.attrs = extra_skl_attr;

- group_events_td.attrs = td_attr;
- group_events_mem.attrs = mem_attr;
- group_events_tsx.attrs = tsx_attr;
- group_format_extra.attrs = extra_attr;
- group_format_extra_skl.attrs = extra_skl_attr;
+ x86_pmu.attr_update = attr_update;
+ } else {
+ hybrid_group_events_td.attrs = td_attr;
+ hybrid_group_events_mem.attrs = mem_attr;
+ hybrid_group_events_tsx.attrs = tsx_attr;
+ hybrid_group_format_extra.attrs = extra_attr;

- x86_pmu.attr_update = attr_update;
+ x86_pmu.attr_update = hybrid_attr_update;
+ }

intel_pmu_check_num_counters(&x86_pmu.num_counters,
&x86_pmu.num_counters_fixed,
--
2.7.4

2021-04-06 00:10:45

by Liang, Kan

[permalink] [raw]
Subject: [PATCH V5 17/25] perf/x86: Add structures for the attributes of Hybrid PMUs

From: Kan Liang <[email protected]>

Hybrid PMUs have different events and formats. In theory, hybrid PMU
specific attributes should be maintained in the dedicated struct
x86_hybrid_pmu, but that wastes space because the events and formats are
similar among hybrid PMUs.

To reduce duplication, all hybrid PMUs will share a group of attributes
in the following patch. To distinguish an attribute among the different
hybrid PMUs, a PMU-aware attribute structure is introduced. A PMU type
is required for the attribute structure. The type is for internal use
only; it is not visible in the sysfs API.

Hybrid PMUs may support the same event name but with different event
encodings, e.g., the mem-loads event on an Atom PMU has a different
encoding than on a Core PMU. This causes a problem if two attributes are
created for them: the current sysfs_update_group() finds an attribute by
searching for the attr name (aka event name), so if two attributes have
the same event name, the first attribute will be replaced.
To address the issue, only one attribute is created per event. The
event_str is extended to store the event encodings from all hybrid PMUs,
separated by ";". The order of the event encodings must follow the order
of the hybrid PMU index. The event_str is for internal use as well. When
a user reads the attribute of a hybrid PMU, only the corresponding part
of the string is displayed.
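
As a hedged example of such a shared attribute (using the
EVENT_ATTR_STR_HYBRID() macro added below), two hybrid PMUs could share a
single mem-loads attribute roughly like this; the event encodings and the
hybrid_big_small PMU-type mask are illustrative placeholders, not values
taken from this patch:

/*
 * One attribute shared by both hybrid PMUs. The encoding before the ";"
 * is shown for the first hybrid PMU (big core), the one after it for
 * the second hybrid PMU (small core). Encodings are placeholders.
 */
EVENT_ATTR_STR_HYBRID(mem-loads, mem_ld_hybrid,
		      "event=0xcd,umask=0x1,ldlat=3;event=0xd0,umask=0x5,ldlat=3",
		      hybrid_big_small);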

Reviewed-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/core.c | 43 +++++++++++++++++++++++++++++++++++++++++++
arch/x86/events/perf_event.h | 19 +++++++++++++++++++
include/linux/perf_event.h | 12 ++++++++++++
3 files changed, 74 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 901b52c..16b4f6f 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1868,6 +1868,49 @@ ssize_t events_ht_sysfs_show(struct device *dev, struct device_attribute *attr,
pmu_attr->event_str_noht);
}

+ssize_t events_hybrid_sysfs_show(struct device *dev,
+ struct device_attribute *attr,
+ char *page)
+{
+ struct perf_pmu_events_hybrid_attr *pmu_attr =
+ container_of(attr, struct perf_pmu_events_hybrid_attr, attr);
+ struct x86_hybrid_pmu *pmu;
+ const char *str, *next_str;
+ int i;
+
+ if (hweight64(pmu_attr->pmu_type) == 1)
+ return sprintf(page, "%s", pmu_attr->event_str);
+
+ /*
+ * Hybrid PMUs may support the same event name, but with different
+ * event encoding, e.g., the mem-loads event on an Atom PMU has
+ * different event encoding from a Core PMU.
+ *
+ * The event_str includes all event encodings. Each event encoding
+ * is divided by ";". The order of the event encodings must follow
+ * the order of the hybrid PMU index.
+ */
+ pmu = container_of(dev_get_drvdata(dev), struct x86_hybrid_pmu, pmu);
+
+ str = pmu_attr->event_str;
+ for (i = 0; i < x86_pmu.num_hybrid_pmus; i++) {
+ if (!(x86_pmu.hybrid_pmu[i].cpu_type & pmu_attr->pmu_type))
+ continue;
+ if (x86_pmu.hybrid_pmu[i].cpu_type & pmu->cpu_type) {
+ next_str = strchr(str, ';');
+ if (next_str)
+ return snprintf(page, next_str - str + 1, "%s", str);
+ else
+ return sprintf(page, "%s", str);
+ }
+ str = strchr(str, ';');
+ str++;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(events_hybrid_sysfs_show);
+
EVENT_ATTR(cpu-cycles, CPU_CYCLES );
EVENT_ATTR(instructions, INSTRUCTIONS );
EVENT_ATTR(cache-references, CACHE_REFERENCES );
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 35510a9..c1c90c3 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -970,6 +970,22 @@ static struct perf_pmu_events_ht_attr event_attr_##v = { \
.event_str_ht = ht, \
}

+#define EVENT_ATTR_STR_HYBRID(_name, v, str, _pmu) \
+static struct perf_pmu_events_hybrid_attr event_attr_##v = { \
+ .attr = __ATTR(_name, 0444, events_hybrid_sysfs_show, NULL),\
+ .id = 0, \
+ .event_str = str, \
+ .pmu_type = _pmu, \
+}
+
+#define FORMAT_HYBRID_PTR(_id) (&format_attr_hybrid_##_id.attr.attr)
+
+#define FORMAT_ATTR_HYBRID(_name, _pmu) \
+static struct perf_pmu_format_hybrid_attr format_attr_hybrid_##_name = {\
+ .attr = __ATTR_RO(_name), \
+ .pmu_type = _pmu, \
+}
+
struct pmu *x86_get_pmu(unsigned int cpu);
extern struct x86_pmu x86_pmu __read_mostly;

@@ -1140,6 +1156,9 @@ ssize_t events_sysfs_show(struct device *dev, struct device_attribute *attr,
char *page);
ssize_t events_ht_sysfs_show(struct device *dev, struct device_attribute *attr,
char *page);
+ssize_t events_hybrid_sysfs_show(struct device *dev,
+ struct device_attribute *attr,
+ char *page);

static inline bool fixed_counter_disabled(int i, struct pmu *pmu)
{
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3f7f89e..b832e09 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1548,6 +1548,18 @@ struct perf_pmu_events_ht_attr {
const char *event_str_noht;
};

+struct perf_pmu_events_hybrid_attr {
+ struct device_attribute attr;
+ u64 id;
+ const char *event_str;
+ u64 pmu_type;
+};
+
+struct perf_pmu_format_hybrid_attr {
+ struct device_attribute attr;
+ u64 pmu_type;
+};
+
ssize_t perf_event_sysfs_show(struct device *dev, struct device_attribute *attr,
char *page);

--
2.7.4

2021-04-09 07:00:03

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH V5 16/25] perf/x86: Register hybrid PMUs

On Mon, Apr 05, 2021 at 08:10:58AM -0700, [email protected] wrote:
> @@ -2089,9 +2119,46 @@ static int __init init_hw_perf_events(void)
> if (err)
> goto out1;
>
> - err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
> - if (err)
> - goto out2;
> + if (!is_hybrid()) {
> + err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
> + if (err)
> + goto out2;
> + } else {
> + u8 cpu_type = get_this_hybrid_cpu_type();
> + struct x86_hybrid_pmu *hybrid_pmu;
> + bool registered = false;
> + int i;
> +
> + if (!cpu_type && x86_pmu.get_hybrid_cpu_type)
> + cpu_type = x86_pmu.get_hybrid_cpu_type();
> +
> + for (i = 0; i < x86_pmu.num_hybrid_pmus; i++) {
> + hybrid_pmu = &x86_pmu.hybrid_pmu[i];
> +
> + hybrid_pmu->pmu = pmu;
> + hybrid_pmu->pmu.type = -1;
> + hybrid_pmu->pmu.attr_update = x86_pmu.attr_update;
> + hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;
> +
> + err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name,
> + (hybrid_pmu->cpu_type == hybrid_big) ? PERF_TYPE_RAW : -1);
> + if (err)
> + continue;
> +
> + if (cpu_type == hybrid_pmu->cpu_type)
> + x86_pmu_update_cpu_context(&hybrid_pmu->pmu, raw_smp_processor_id());
> +
> + registered = true;
> + }
> +
> + if (!registered) {
> + pr_warn("Failed to register hybrid PMUs\n");
> + kfree(x86_pmu.hybrid_pmu);
> + x86_pmu.hybrid_pmu = NULL;
> + x86_pmu.num_hybrid_pmus = 0;
> + goto out2;
> + }

I don't think this is quite right. registered will be true even if one
fails, while I think you meant to only have it true when all (both)
types registered correctly.

> + }
>
> return 0;
>

2021-04-09 09:23:36

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH V5 21/25] perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU

On Mon, Apr 05, 2021 at 08:11:03AM -0700, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> Current Hardware events and Hardware cache events have special perf
> types, PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. The two types don't
> pass the PMU type in the user interface. For a hybrid system, the perf
> subsystem doesn't know which PMU the events belong to. The first capable
> PMU will always be assigned to the events. The events never get a chance
> to run on the other capable PMUs.
>
> Add a PMU aware version PERF_TYPE_HARDWARE_PMU and
> PERF_TYPE_HW_CACHE_PMU. The PMU type ID is stored at attr.config[40:32].
> Support the new types for X86.

Obviously ARM would need the same, but also, I don't think I see the
need to introduce new types. AFAICT there is nothing that stops this
scheme from working for the existing types.

Also, pmu type is 32bit, not 8bit.

So how about something like this?

---
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 3f7f89ea5e51..074c7687d466 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -260,15 +260,16 @@ struct perf_event;
/**
* pmu::capabilities flags
*/
-#define PERF_PMU_CAP_NO_INTERRUPT 0x01
-#define PERF_PMU_CAP_NO_NMI 0x02
-#define PERF_PMU_CAP_AUX_NO_SG 0x04
-#define PERF_PMU_CAP_EXTENDED_REGS 0x08
-#define PERF_PMU_CAP_EXCLUSIVE 0x10
-#define PERF_PMU_CAP_ITRACE 0x20
-#define PERF_PMU_CAP_HETEROGENEOUS_CPUS 0x40
-#define PERF_PMU_CAP_NO_EXCLUDE 0x80
-#define PERF_PMU_CAP_AUX_OUTPUT 0x100
+#define PERF_PMU_CAP_NO_INTERRUPT 0x0001
+#define PERF_PMU_CAP_NO_NMI 0x0002
+#define PERF_PMU_CAP_AUX_NO_SG 0x0004
+#define PERF_PMU_CAP_EXTENDED_REGS 0x0008
+#define PERF_PMU_CAP_EXCLUSIVE 0x0010
+#define PERF_PMU_CAP_ITRACE 0x0020
+#define PERF_PMU_CAP_HETEROGENEOUS_CPUS 0x0040
+#define PERF_PMU_CAP_NO_EXCLUDE 0x0080
+#define PERF_PMU_CAP_AUX_OUTPUT 0x0100
+#define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0200

struct perf_output_handle;

diff --git a/kernel/events/core.c b/kernel/events/core.c
index f07943183041..910a0666ebfe 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11113,14 +11113,21 @@ static struct pmu *perf_init_event(struct perf_event *event)
* are often aliases for PERF_TYPE_RAW.
*/
type = event->attr.type;
- if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE)
- type = PERF_TYPE_RAW;
+ if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE) {
+ type = event->attr.config >> 32;
+ if (!type)
+ type = PERF_TYPE_RAW;
+ }

again:
rcu_read_lock();
pmu = idr_find(&pmu_idr, type);
rcu_read_unlock();
if (pmu) {
+ if (event->attr.type != type && type != PERF_TYPE_RAW &&
+ !(pmu->capabilities & PERF_PMU_CAP_EXTENDED_HW_TYPE))
+ goto fail;
+
ret = perf_try_init_event(pmu, event);
if (ret == -ENOENT && event->attr.type != type) {
type = event->attr.type;
@@ -11143,6 +11150,7 @@ static struct pmu *perf_init_event(struct perf_event *event)
goto unlock;
}
}
+fail:
pmu = ERR_PTR(-ENOENT);
unlock:
srcu_read_unlock(&pmus_srcu, idx);

2021-04-09 09:28:46

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH V5 23/25] perf/x86/msr: Add Alder Lake CPU support

On Mon, Apr 05, 2021 at 08:11:05AM -0700, [email protected] wrote:
> From: Kan Liang <[email protected]>
>
> PPERF and SMI_COUNT MSRs are also supported on Alder Lake.
>
> The External Design Specification (EDS) is not published yet. It comes
> from an authoritative internal source.
>
> The patch has been tested on real hardware.
>
> Reviewed-by: Andi Kleen <[email protected]>
> Signed-off-by: Kan Liang <[email protected]>
> ---
> arch/x86/events/msr.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
> index 680404c..c853b28 100644
> --- a/arch/x86/events/msr.c
> +++ b/arch/x86/events/msr.c
> @@ -100,6 +100,8 @@ static bool test_intel(int idx, void *data)
> case INTEL_FAM6_TIGERLAKE_L:
> case INTEL_FAM6_TIGERLAKE:
> case INTEL_FAM6_ROCKETLAKE:
> + case INTEL_FAM6_ALDERLAKE:
> + case INTEL_FAM6_ALDERLAKE_L:
> if (idx == PERF_MSR_SMI || idx == PERF_MSR_PPERF)
> return true;
> break;

If only it would be sanely enumerated... What about sapphire rapids?

2021-04-09 13:52:29

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH V5 16/25] perf/x86: Register hybrid PMUs



On 4/9/2021 2:58 AM, Peter Zijlstra wrote:
> On Mon, Apr 05, 2021 at 08:10:58AM -0700, [email protected] wrote:
>> @@ -2089,9 +2119,46 @@ static int __init init_hw_perf_events(void)
>> if (err)
>> goto out1;
>>
>> - err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
>> - if (err)
>> - goto out2;
>> + if (!is_hybrid()) {
>> + err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
>> + if (err)
>> + goto out2;
>> + } else {
>> + u8 cpu_type = get_this_hybrid_cpu_type();
>> + struct x86_hybrid_pmu *hybrid_pmu;
>> + bool registered = false;
>> + int i;
>> +
>> + if (!cpu_type && x86_pmu.get_hybrid_cpu_type)
>> + cpu_type = x86_pmu.get_hybrid_cpu_type();
>> +
>> + for (i = 0; i < x86_pmu.num_hybrid_pmus; i++) {
>> + hybrid_pmu = &x86_pmu.hybrid_pmu[i];
>> +
>> + hybrid_pmu->pmu = pmu;
>> + hybrid_pmu->pmu.type = -1;
>> + hybrid_pmu->pmu.attr_update = x86_pmu.attr_update;
>> + hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;
>> +
>> + err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name,
>> + (hybrid_pmu->cpu_type == hybrid_big) ? PERF_TYPE_RAW : -1);
>> + if (err)
>> + continue;
>> +
>> + if (cpu_type == hybrid_pmu->cpu_type)
>> + x86_pmu_update_cpu_context(&hybrid_pmu->pmu, raw_smp_processor_id());
>> +
>> + registered = true;
>> + }
>> +
>> + if (!registered) {
>> + pr_warn("Failed to register hybrid PMUs\n");
>> + kfree(x86_pmu.hybrid_pmu);
>> + x86_pmu.hybrid_pmu = NULL;
>> + x86_pmu.num_hybrid_pmus = 0;
>> + goto out2;
>> + }
>
> I don't think this is quite right. registered will be true even if one
> fails, while I think you meant to only have it true when all (both)
> types registered correctly.

No, I mean that perf errors out only when all types fail to be registered.

For the case (1 failure, 1 success), users can still access the
registered PMU.
When a CPU that belongs to the unregistered PMU comes online, a warning
will be displayed, because in init_hybrid_pmu() we check the PMU type
before updating the CPU mask.

if (WARN_ON_ONCE(!pmu || (pmu->pmu.type == -1))) {
cpuc->pmu = NULL;
return false;
}


Thanks,
Kan

2021-04-09 14:39:30

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH V5 23/25] perf/x86/msr: Add Alder Lake CPU support



On 4/9/2021 5:24 AM, Peter Zijlstra wrote:
> On Mon, Apr 05, 2021 at 08:11:05AM -0700, [email protected] wrote:
>> From: Kan Liang <[email protected]>
>>
>> PPERF and SMI_COUNT MSRs are also supported on Alder Lake.
>>
>> The External Design Specification (EDS) is not published yet. It comes
>> from an authoritative internal source.
>>
>> The patch has been tested on real hardware.
>>
>> Reviewed-by: Andi Kleen <[email protected]>
>> Signed-off-by: Kan Liang <[email protected]>
>> ---
>> arch/x86/events/msr.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
>> index 680404c..c853b28 100644
>> --- a/arch/x86/events/msr.c
>> +++ b/arch/x86/events/msr.c
>> @@ -100,6 +100,8 @@ static bool test_intel(int idx, void *data)
>> case INTEL_FAM6_TIGERLAKE_L:
>> case INTEL_FAM6_TIGERLAKE:
>> case INTEL_FAM6_ROCKETLAKE:
>> + case INTEL_FAM6_ALDERLAKE:
>> + case INTEL_FAM6_ALDERLAKE_L:
>> if (idx == PERF_MSR_SMI || idx == PERF_MSR_PPERF)
>> return true;
>> break;
>
> If only it would be sanely enumerated... What about sapphire rapids?
>

SPR also supports the MSRs. I will submit a patch separately to support SPR.
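
Presumably that separate patch just adds the model to the same switch,
something like (a sketch, not the actual submitted patch):

	case INTEL_FAM6_SAPPHIRERAPIDS_X:
		if (idx == PERF_MSR_SMI || idx == PERF_MSR_PPERF)
			return true;
		break;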

Thanks,
Kan

2021-04-09 15:25:49

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH V5 21/25] perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU



On 4/9/2021 5:21 AM, Peter Zijlstra wrote:
> On Mon, Apr 05, 2021 at 08:11:03AM -0700, [email protected] wrote:
>> From: Kan Liang <[email protected]>
>>
>> Current Hardware events and Hardware cache events have special perf
>> types, PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE. The two types don't
>> pass the PMU type in the user interface. For a hybrid system, the perf
>> subsystem doesn't know which PMU the events belong to. The first capable
>> PMU will always be assigned to the events. The events never get a chance
>> to run on the other capable PMUs.
>>
>> Add a PMU aware version PERF_TYPE_HARDWARE_PMU and
>> PERF_TYPE_HW_CACHE_PMU. The PMU type ID is stored at attr.config[40:32].
>> Support the new types for X86.
>
> Obviously ARM would need the same, but also, I don't think I see the
> need to introduce new types. AFAICT there is nothing that stops this
> scheme from working for the existing types.
>
> Also, pmu type is 32bit, not 8bit.
>
> So how about something like this?
>
> ---
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 3f7f89ea5e51..074c7687d466 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -260,15 +260,16 @@ struct perf_event;
> /**
> * pmu::capabilities flags
> */
> -#define PERF_PMU_CAP_NO_INTERRUPT 0x01
> -#define PERF_PMU_CAP_NO_NMI 0x02
> -#define PERF_PMU_CAP_AUX_NO_SG 0x04
> -#define PERF_PMU_CAP_EXTENDED_REGS 0x08
> -#define PERF_PMU_CAP_EXCLUSIVE 0x10
> -#define PERF_PMU_CAP_ITRACE 0x20
> -#define PERF_PMU_CAP_HETEROGENEOUS_CPUS 0x40
> -#define PERF_PMU_CAP_NO_EXCLUDE 0x80
> -#define PERF_PMU_CAP_AUX_OUTPUT 0x100
> +#define PERF_PMU_CAP_NO_INTERRUPT 0x0001
> +#define PERF_PMU_CAP_NO_NMI 0x0002
> +#define PERF_PMU_CAP_AUX_NO_SG 0x0004
> +#define PERF_PMU_CAP_EXTENDED_REGS 0x0008
> +#define PERF_PMU_CAP_EXCLUSIVE 0x0010
> +#define PERF_PMU_CAP_ITRACE 0x0020
> +#define PERF_PMU_CAP_HETEROGENEOUS_CPUS 0x0040
> +#define PERF_PMU_CAP_NO_EXCLUDE 0x0080
> +#define PERF_PMU_CAP_AUX_OUTPUT 0x0100
> +#define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0200
>
> struct perf_output_handle;
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index f07943183041..910a0666ebfe 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -11113,14 +11113,21 @@ static struct pmu *perf_init_event(struct perf_event *event)
> * are often aliases for PERF_TYPE_RAW.
> */
> type = event->attr.type;
> - if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE)
> - type = PERF_TYPE_RAW;
> + if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE) {
> + type = event->attr.config >> 32;
> + if (!type)
> + type = PERF_TYPE_RAW;
> + }

For an old tool, the default PMU will be the big core. I think that's OK
for X86.

Since only the low 32 bits of event->attr.config contain the 'real'
config value, I think all the ARCHs will do event->attr.config &=
0xffffffff. Maybe we should move it to the generic code.

+ if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE) {
+ type = event->attr.config >> 32;
+ if (!type)
+ type = PERF_TYPE_RAW;
+ else
+ event->attr.config &= 0xffffffff;

>
> again:
> rcu_read_lock();
> pmu = idr_find(&pmu_idr, type);
> rcu_read_unlock();
> if (pmu) {
> + if (event->attr.type != type && type != PERF_TYPE_RAW &&
> + !(pmu->capabilities & PERF_PMU_CAP_EXTENDED_HW_TYPE))
> + goto fail;
> +
> ret = perf_try_init_event(pmu, event);
> if (ret == -ENOENT && event->attr.type != type) {
> type = event->attr.type;


I don't think we want to go through all available PMUs again if the user
has already specified a PMU.

I've updated the patch a little bit. (Not tested yet; I will do some tests.)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 3158cbc..4f5f9a9 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2175,6 +2175,7 @@ static int __init init_hw_perf_events(void)
hybrid_pmu->pmu.type = -1;
hybrid_pmu->pmu.attr_update = x86_pmu.attr_update;
hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;
+ hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_EXTENDED_HW_TYPE;

err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name,
(hybrid_pmu->cpu_type == hybrid_big) ? PERF_TYPE_RAW : -1);
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index b832e09..391bfb7 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -269,6 +269,7 @@ struct perf_event;
#define PERF_PMU_CAP_HETEROGENEOUS_CPUS 0x40
#define PERF_PMU_CAP_NO_EXCLUDE 0x80
#define PERF_PMU_CAP_AUX_OUTPUT 0x100
+#define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x200

struct perf_output_handle;

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index ad15e40..7ec80ac9 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -38,6 +38,20 @@ enum perf_type_id {
};

/*
+ * attr.config layout for type PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE
+ * PERF_TYPE_HARDWARE: 0xEE000000AA
+ * AA: hardware event ID
+ * EE: PMU type ID
+ * PERF_TYPE_HW_CACHE: 0xEE00DDCCBB
+ * BB: hardware cache ID
+ * CC: hardware cache op ID
+ * DD: hardware cache op result ID
+ * EE: PMU type ID
+ * If the PMU type ID is 0, the PERF_TYPE_RAW will be applied.
+ */
+#define PERF_HW_EVENT_MASK 0xffffffff
+
+/*
* Generalized performance event event_id types, used by the
* attr.event_id parameter of the sys_perf_event_open()
* syscall:
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f079431..9d9a792 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -11093,6 +11093,8 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
return ret;
}

+#define PERF_EXTENDED_HW_TYPE (event->attr.config >> 32)
+
static struct pmu *perf_init_event(struct perf_event *event)
{
int idx, type, ret;
@@ -11113,16 +11115,25 @@ static struct pmu *perf_init_event(struct perf_event *event)
* are often aliases for PERF_TYPE_RAW.
*/
type = event->attr.type;
- if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE)
- type = PERF_TYPE_RAW;
+ if (type == PERF_TYPE_HARDWARE || type == PERF_TYPE_HW_CACHE) {
+ type = PERF_EXTENDED_HW_TYPE;
+ if (!type)
+ type = PERF_TYPE_RAW;
+ else
+ event->attr.config &= PERF_HW_EVENT_MASK;
+ }

again:
rcu_read_lock();
pmu = idr_find(&pmu_idr, type);
rcu_read_unlock();
if (pmu) {
+ if (event->attr.type != type && type != PERF_TYPE_RAW &&
+ !(pmu->capabilities & PERF_PMU_CAP_EXTENDED_HW_TYPE))
+ goto fail;
+
ret = perf_try_init_event(pmu, event);
- if (ret == -ENOENT && event->attr.type != type) {
+ if (ret == -ENOENT && event->attr.type != type && !PERF_EXTENDED_HW_TYPE) {
type = event->attr.type;
goto again;
}
@@ -11143,6 +11154,7 @@ static struct pmu *perf_init_event(struct perf_event *event)
goto unlock;
}
}
+fail:
pmu = ERR_PTR(-ENOENT);
unlock:
srcu_read_unlock(&pmus_srcu, idx);

Thanks,
Kan

2021-04-09 15:49:17

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH V5 16/25] perf/x86: Register hybrid PMUs

On Fri, Apr 09, 2021 at 09:50:20AM -0400, Liang, Kan wrote:
>
>
> On 4/9/2021 2:58 AM, Peter Zijlstra wrote:
> > On Mon, Apr 05, 2021 at 08:10:58AM -0700, [email protected] wrote:
> > > @@ -2089,9 +2119,46 @@ static int __init init_hw_perf_events(void)
> > > if (err)
> > > goto out1;
> > > - err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
> > > - if (err)
> > > - goto out2;
> > > + if (!is_hybrid()) {
> > > + err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
> > > + if (err)
> > > + goto out2;
> > > + } else {
> > > + u8 cpu_type = get_this_hybrid_cpu_type();
> > > + struct x86_hybrid_pmu *hybrid_pmu;
> > > + bool registered = false;
> > > + int i;
> > > +
> > > + if (!cpu_type && x86_pmu.get_hybrid_cpu_type)
> > > + cpu_type = x86_pmu.get_hybrid_cpu_type();
> > > +
> > > + for (i = 0; i < x86_pmu.num_hybrid_pmus; i++) {
> > > + hybrid_pmu = &x86_pmu.hybrid_pmu[i];
> > > +
> > > + hybrid_pmu->pmu = pmu;
> > > + hybrid_pmu->pmu.type = -1;
> > > + hybrid_pmu->pmu.attr_update = x86_pmu.attr_update;
> > > + hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;
> > > +
> > > + err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name,
> > > + (hybrid_pmu->cpu_type == hybrid_big) ? PERF_TYPE_RAW : -1);
> > > + if (err)
> > > + continue;
> > > +
> > > + if (cpu_type == hybrid_pmu->cpu_type)
> > > + x86_pmu_update_cpu_context(&hybrid_pmu->pmu, raw_smp_processor_id());
> > > +
> > > + registered = true;
> > > + }
> > > +
> > > + if (!registered) {
> > > + pr_warn("Failed to register hybrid PMUs\n");
> > > + kfree(x86_pmu.hybrid_pmu);
> > > + x86_pmu.hybrid_pmu = NULL;
> > > + x86_pmu.num_hybrid_pmus = 0;
> > > + goto out2;
> > > + }
> >
> > I don't think this is quite right. registered will be true even if one
> > fails, while I think you meant to only have it true when all (both)
> > types registered correctly.
>
> No, I mean that perf errors out only when all types fail to be registered.

All or nothing seems a better approach to me. There really isn't a good
reason for any one of them to fail.

2021-04-09 15:51:00

by Liang, Kan

[permalink] [raw]
Subject: Re: [PATCH V5 16/25] perf/x86: Register hybrid PMUs



On 4/9/2021 11:45 AM, Peter Zijlstra wrote:
> On Fri, Apr 09, 2021 at 09:50:20AM -0400, Liang, Kan wrote:
>>
>>
>> On 4/9/2021 2:58 AM, Peter Zijlstra wrote:
>>> On Mon, Apr 05, 2021 at 08:10:58AM -0700, [email protected] wrote:
>>>> @@ -2089,9 +2119,46 @@ static int __init init_hw_perf_events(void)
>>>> if (err)
>>>> goto out1;
>>>> - err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
>>>> - if (err)
>>>> - goto out2;
>>>> + if (!is_hybrid()) {
>>>> + err = perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
>>>> + if (err)
>>>> + goto out2;
>>>> + } else {
>>>> + u8 cpu_type = get_this_hybrid_cpu_type();
>>>> + struct x86_hybrid_pmu *hybrid_pmu;
>>>> + bool registered = false;
>>>> + int i;
>>>> +
>>>> + if (!cpu_type && x86_pmu.get_hybrid_cpu_type)
>>>> + cpu_type = x86_pmu.get_hybrid_cpu_type();
>>>> +
>>>> + for (i = 0; i < x86_pmu.num_hybrid_pmus; i++) {
>>>> + hybrid_pmu = &x86_pmu.hybrid_pmu[i];
>>>> +
>>>> + hybrid_pmu->pmu = pmu;
>>>> + hybrid_pmu->pmu.type = -1;
>>>> + hybrid_pmu->pmu.attr_update = x86_pmu.attr_update;
>>>> + hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;
>>>> +
>>>> + err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name,
>>>> + (hybrid_pmu->cpu_type == hybrid_big) ? PERF_TYPE_RAW : -1);
>>>> + if (err)
>>>> + continue;
>>>> +
>>>> + if (cpu_type == hybrid_pmu->cpu_type)
>>>> + x86_pmu_update_cpu_context(&hybrid_pmu->pmu, raw_smp_processor_id());
>>>> +
>>>> + registered = true;
>>>> + }
>>>> +
>>>> + if (!registered) {
>>>> + pr_warn("Failed to register hybrid PMUs\n");
>>>> + kfree(x86_pmu.hybrid_pmu);
>>>> + x86_pmu.hybrid_pmu = NULL;
>>>> + x86_pmu.num_hybrid_pmus = 0;
>>>> + goto out2;
>>>> + }
>>>
>>> I don't think this is quite right. registered will be true even if one
>>> fails, while I think you meant to only have it true when all (both)
>>> types registered correctly.
>>
>> No, I mean that perf errors out only when all types fail to be registered.
>
> All or nothing seems a better approach to me. There really isn't a good
> reason for any one of them to fail.
>

Sure. I will change it in V6.
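
Something along these lines, perhaps (a sketch only, not the final V6
code): break out of the loop on the first failure and unwind whatever was
already registered:

	for (i = 0; i < x86_pmu.num_hybrid_pmus; i++) {
		hybrid_pmu = &x86_pmu.hybrid_pmu[i];

		hybrid_pmu->pmu = pmu;
		hybrid_pmu->pmu.type = -1;
		hybrid_pmu->pmu.attr_update = x86_pmu.attr_update;
		hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;

		err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name,
					(hybrid_pmu->cpu_type == hybrid_big) ? PERF_TYPE_RAW : -1);
		if (err)
			break;

		if (cpu_type == hybrid_pmu->cpu_type)
			x86_pmu_update_cpu_context(&hybrid_pmu->pmu, raw_smp_processor_id());
	}

	if (i < x86_pmu.num_hybrid_pmus) {
		/* all or nothing: unregister the PMUs that did register */
		while (i--)
			perf_pmu_unregister(&x86_pmu.hybrid_pmu[i].pmu);
		pr_warn("Failed to register hybrid PMUs\n");
		kfree(x86_pmu.hybrid_pmu);
		x86_pmu.hybrid_pmu = NULL;
		x86_pmu.num_hybrid_pmus = 0;
		goto out2;
	}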

Thanks,
Kan

2021-04-09 15:53:02

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH V5 21/25] perf: Introduce PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU

On Fri, Apr 09, 2021 at 11:24:07AM -0400, Liang, Kan wrote:

> diff --git a/include/uapi/linux/perf_event.h
> b/include/uapi/linux/perf_event.h
> index ad15e40..7ec80ac9 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -38,6 +38,20 @@ enum perf_type_id {
> };
>
> /*
> + * attr.config layout for type PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE
> + * PERF_TYPE_HARDWARE: 0xEE000000AA
> + * AA: hardware event ID
> + * EE: PMU type ID
> + * PERF_TYPE_HW_CACHE: 0xEE00DDCCBB
> + * BB: hardware cache ID
> + * CC: hardware cache op ID
> + * DD: hardware cache op result ID
> + * EE: PMU type ID

it's: 0xEEEEEEEE00DDCCBB and 0xEEEEEEEE000000AA