2010-01-20 09:22:14

by Tomasz Fujak

Subject: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

Hi,

While I managed to build and run the early version (back from December), I was unable to find the newest sources (infra + ARMv6, ARMv7 support).
Where do I find them?

The following patches provide a sysfs entry with hardware event human readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" % (event_value, minval, maxval, name, description) and means to populate the file.
The version posted contains ARMv6, ARMv7 (Cortex-A[89]) support in this matter.

The intended use is twofold: for users to read the list directly and for tools (like perf).

This series includes:
[PATCH v1 1/2] perfevent: Add performance event structure definition and 'extevents' sysfs entry
[PATCH v1 2/2] [ARM] perfevent: Event description list for ARMv6, Cortex-A8 and Cortex-A9 exported

Thanks,
--
Tomasz Fujak


2010-01-20 09:11:56

by Tomasz Fujak

Subject: [PATCH v1 2/2] [ARM] perfevent: Event description list for ARMv6, Cortex-A8 and Cortex-A9 exported

Signed-off-by: Tomasz Fujak <[email protected]>
Reviewed-by: Marek Szyprowski <[email protected]>
Reviewed-by: Kyungmin Park <[email protected]>

---
arch/arm/kernel/perf_event.c | 341 +++++++++++++++++++++++++++++++++++++++++-
1 files changed, 337 insertions(+), 4 deletions(-)

diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index 8d24be3..64573a2 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -26,6 +26,17 @@

static const struct pmu_irqs *pmu_irqs;

+#define PERF_EVENT_DESC_ENTRY(_val, _min, _max, _name, _desc) { \
+ .config = PERF_EVENT_RAW_TO_CONFIG(_val),\
+ .min_value = (_min),\
+ .max_value = (_max),\
+ .name = (_name),\
+ .description = (_desc)\
+}
+
+#define minv 0
+#define maxv 0
+
/*
* Hardware lock to serialize accesses to PMU registers. Needed for the
* read/modify/write sequences.
@@ -84,6 +95,7 @@ struct arm_pmu {

/* Set at runtime when we know what CPU type we are. */
static struct arm_pmu *armpmu;
+static LIST_HEAD(perf_events_arm);

#define HW_OP_UNSUPPORTED 0xFFFF

@@ -96,6 +108,17 @@ static unsigned armpmu_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX];

+static void
+perf_event_add_events(struct list_head *head,
+ struct perf_event_description *array,
+ unsigned int count)
+{
+ unsigned int idx = 0;
+
+ while (idx < count)
+ __list_add(&array[idx++].list, head->prev, head);
+}
+
static const int
armpmu_map_cache_event(u64 config)
{
@@ -673,6 +696,56 @@ static const unsigned armv6_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
},
};

+static struct perf_event_description armv6_event_description[] = {
+ /* armv6 events */
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_ICACHE_MISS, minv, maxv,
+ "ICACHE_MISS", "Instruction cache miss"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_IBUF_STALL, minv, maxv,
+ "IBUF_STALL", "Instruction fetch stall cycle"
+ " (either uTLB or I-cache miss)"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DDEP_STALL, minv, maxv,
+ "DDEP_STALL", "Data dependency stall cycle"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_ITLB_MISS, minv, maxv,
+ "ITLB_MISS", "Instruction uTLB miss"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DTLB_MISS, minv, maxv,
+ "DTLB_MISS", "Data uTLB miss"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_BR_EXEC, minv, maxv,
+ "BR_EXEC", "Branch instruction executed "
+ "(even if the PC hasn't been affected)"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_BR_MISPREDICT, minv, maxv,
+ "BR_MISPREDICT", "Branch mispredicted"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_INSTR_EXEC, minv, maxv,
+ "INSTR_EXEC", "Instruction executed (may be incremented"
+ " by 2 on some occasions)"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DCACHE_HIT, minv, maxv,
+ "DCACHE_HIT", "Data cache hit for cacheable locations "
+ "(cache ops don't count)"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DCACHE_ACCESS, minv, maxv,
+ "DCACHE_ACCESS", "Data cache access, all locations (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DCACHE_MISS, minv, maxv,
+ "DCACHE_MISS", "Data cache miss (cache ops don't count)"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_DCACHE_WBACK, minv, maxv,
+ "DCACHE_WBACK", "Data cache writeback (once for "
+ "half a cache line)"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_SW_PC_CHANGE, minv, maxv,
+ "SW_PC_CHANGE", "Software PC change (does not count if the "
+ "mode is changed, i.e. at SVC)"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_MAIN_TLB_MISS, minv, maxv,
+ "MAIN_TLB_MISS", "Main TLB (not uTLB) miss"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_EXPL_D_ACCESS, minv, maxv,
+ "EXPL_D_ACCESS", "Explicit external data access, DCache "
+ "linefill, Uncached, write-through"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_LSU_FULL_STALL, minv, maxv,
+ "LSU_FULL_STALL", "Stall cycle due to full Load/Store"
+ " Unit queue"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_WBUF_DRAINED, minv, maxv,
+ "WBUF_DRAINED", "Write buffer drained because of DSB or "
+ "Strongly Ordered memory operation"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_CPU_CYCLES, minv, maxv,
+ "CPU_CYCLES", "CPU cycles"),
+ PERF_EVENT_DESC_ENTRY(ARMV6_PERFCTR_NOP, minv, maxv, "NOP", "???")
+};
+
static inline unsigned long
armv6_pmcr_read(void)
{
@@ -1223,6 +1296,248 @@ static const unsigned armv7_a8_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
},
};

+static struct perf_event_description armv7_event_description[] = {
+ /* armv7 generic events */
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PMNC_SW_INCR, minv, maxv,
+ "PMNC_SW_INCR", "Software increment (write to a "
+ "dedicated register)"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_IFETCH_MISS, minv, maxv,
+ "IFETCH_MISS", "Instruction fetch miss that causes "
+ "refill. Speculative misses count unless they don't "
+ "make it to execution; maintenance operations don't"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ITLB_MISS, minv, maxv,
+ "ITLB_MISS", "Instruction TLB miss that causes a refill."
+ " Both speculative and explicit accesses count"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DCACHE_REFILL, minv, maxv,
+ "DCACHE_REFILL", "Data cache refill. Same rules as ITLB_MISS"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DCACHE_ACCESS, minv, maxv,
+ "DCACHE_ACCESS", "Data cache access. Same rules as ITLB_MISS"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DTLB_REFILL, minv, maxv,
+ "DTLB_REFILL", "Data TLB refill. Same rules as ITLB_MISS"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DREAD, minv, maxv, "DREAD",
+ "Data read executed (including SWP)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DWRITE, minv, maxv, "DWRITE",
+ "Data write executed (including SWP)"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_EXC_TAKEN, minv, maxv,
+ "EXC_TAKEN", "Exception taken"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_EXC_EXECUTED, minv, maxv,
+ "EXC_EXECUTED", "Exception return executed"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CID_WRITE, minv, maxv,
+ "CID_WRITE", "Context ID register written"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_WRITE, minv, maxv, "PC_WRITE",
+ "Software change of the PC (R15)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_IMM_BRANCH, minv, maxv,
+ "PC_IMM_BRANCH", "Immediate branch (B[L], BLX, CB[N]Z, HB[L],"
+ " HBLP), including conditional that fail"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_UNALIGNED_ACCESS, minv, maxv,
+ "UNALIGNED_ACCESS", "Data access unaligned to the transfer"
+ " size"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_BRANCH_MIS_PRED, minv, maxv,
+ "BRANCH_MISS_PRED", "Branch misprediction or not predicted"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CLOCK_CYCLES, minv, maxv,
+ "CLOCK_CYCLES", "Cycle count"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_BRANCH_MIS_USED, minv, maxv,
+ "BRANCH_MIS_USED", "Branch or other program flow change that "
+ "could have been predicted"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CPU_CYCLES, minv, maxv,
+ "CPU_CYCLES", "measures cpu cycles, the only allowed event"
+ " for the first counter")
+};
+
+static struct perf_event_description cortexa8_event_description[] = {
+ /* Cortex A8 specific events */
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_INSTR_EXECUTED, minv, maxv,
+ "INSTR_EXECUTED", "Instruction executed (including conditional"
+ " that don't pass)"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_PROC_RETURN, minv, maxv,
+ "PC_PROC_RETURN", "Procedure return (BX LR; MOV PC, LR; POP "
+ "{.., PC} and such)"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_WRITE_BUFFER_FULL, minv, maxv,
+ "WRITE_BUFFER_FULL", "Write buffer full cycle"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_STORE_MERGED, minv, maxv,
+ "L2_STORE_MERGED", "Store that is merged in the L2 memory"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_STORE_BUFF, minv, maxv,
+ "L2_STORE_BUFF", "A bufferable store from load/store to L2"
+ " cache, evictions and cast out data don't count (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_ACCESS, minv, maxv, "L2_ACCESS",
+ "L2 cache access"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_CACH_MISS, minv, maxv,
+ "L2_CACH_MISS", "L2 cache miss"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_AXI_READ_CYCLES, minv, maxv,
+ "AXI_READ_CYCLES", "AXI read data transfers"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_AXI_WRITE_CYCLES, minv, maxv,
+ "AXI_WRITE_CYCLES", "AXI write data transfers"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_MEMORY_REPLAY, minv, maxv,
+ "MEMORY_REPLAY", "Replay event in the memory subsystem (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_UNALIGNED_ACCESS_REPLAY, minv, maxv,
+ "UNALIGNED_ACCESS_REPLAY", "An unaligned memory access that"
+ " results in a replay (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_DATA_MISS, minv, maxv,
+ "L1_DATA_MISS", "L1 data cache miss"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_INST_MISS, minv, maxv,
+ "L1_INST_MISS", "L1 instruction cache miss"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_DATA_COLORING, minv, maxv,
+ "L1_DATA_COLORING", "L1 access that triggers eviction or cast"
+ " out (page coloring alias)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_NEON_DATA, minv, maxv,
+ "L1_NEON_DATA", "A NEON access that hits the L1 DCache"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_NEON_CACH_DATA, minv, maxv,
+ "L1_NEON_CACH_DATA", "A cacheable NEON access that hits the"
+ " L1 DCache"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_NEON, minv, maxv, "L2_NEON",
+ "A NEON memory access that results in L2 being"
+ " accessed"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L2_NEON_HIT, minv, maxv,
+ "L2_NEON_HIT", "A NEON hit in the L2"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_L1_INST, minv, maxv, "L1_INST",
+ "A L1 instruction access (CP15 cache ops don't count)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_RETURN_MIS_PRED, minv, maxv,
+ "PC_RETURN_MIS_PRED", "A return stack misprediction because"
+ " of incorrect stack address"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_BRANCH_FAILED, minv, maxv,
+ "PC_BRANCH_FAILED", "Branch misprediction (both ways)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_BRANCH_TAKEN, minv, maxv,
+ "PC_BRANCH_TAKEN", "Predictable branch predicted taken"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PC_BRANCH_EXECUTED, minv, maxv,
+ "PC_BRANCH_EXECUTED", "Predictable branch executed taken"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_OP_EXECUTED, minv, maxv,
+ "OP_EXECUTED", "uOP executed (an instruction or a "
+ "multi-instruction step)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CYCLES_INST_STALL, minv, maxv,
+ "CYCLES_INST_STALL", "Instruction issue unit idle cycle"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CYCLES_INST, minv, maxv,
+ "CYCLES_INST", "Instruction issued (multicycle instruction "
+ "counts for one)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CYCLES_NEON_DATA_STALL, minv, maxv,
+ "CYCLES_NEON_DATA_STALL", "Cycles the CPU waits on MRC "
+ "from NEON"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_CYCLES_NEON_INST_STALL, minv, maxv,
+ "CYCLES_NEON_INST_STALL", "Stall cycles caused by full NEON"
+ " queue (either ins. queue or load queue)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_NEON_CYCLES, minv, maxv,
+ "NEON_CYCLES", "Cycles that both processors (ARM & NEON)"
+ " are not idle"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PMU0_EVENTS, minv, maxv,
+ "PMU0_EVENTS", "Event on external input source (PMUEXTIN[0])"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PMU1_EVENTS, minv, maxv,
+ "PMU1_EVENTS", "Event on external input source (PMUEXTIN[1])"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PMU_EVENTS, minv, maxv,
+ "PMU_EVENTS", "Event on either of the external input sources"
+ " (PMUEXTIN[0,1])")
+};
+
+static struct perf_event_description cortexa9_event_description[] = {
+ /* ARMv7 Cortex-A9 specific event types */
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_JAVA_HW_BYTECODE_EXEC, minv, maxv,
+ "JAVA_HW_BYTECODE_EXEC", "Java bytecode executed in HW"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_JAVA_SW_BYTECODE_EXEC, minv, maxv,
+ "JAVA_SW_BYTECODE_EXEC", "Java bytecode executed in SW"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_JAZELLE_BRANCH_EXEC, minv, maxv,
+ "JAZELLE_BRANCH_EXEC", "Jazelle backward branch"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_COHERENT_LINE_MISS, minv, maxv,
+ "COHERENT_LINE_MISS", "???"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_COHERENT_LINE_HIT, minv, maxv,
+ "COHERENT_LINE_HIT", "???"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ICACHE_DEP_STALL_CYCLES, minv,
+ maxv, "ICACHE_DEP_STALL_CYCLES", "Instruction cache "
+ "dependent stall"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DCACHE_DEP_STALL_CYCLES, minv,
+ maxv, "DCACHE_DEP_STALL_CYCLES", "Data cache dependent stall"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_TLB_MISS_DEP_STALL_CYCLES, minv,
+ maxv, "TLB_MISS_DEP_STALL_CYCLES", "Main TLB miss stall"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_STREX_EXECUTED_PASSED, minv, maxv,
+ "STREX_EXECUTED_PASSED", "STREX passed"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_STREX_EXECUTED_FAILED, minv, maxv,
+ "STREX_EXECUTED_FAILED", "STREX failed"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DATA_EVICTION, minv, maxv,
+ "DATA_EVICTION", "Cache data eviction (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ISSUE_STAGE_NO_INST, minv, maxv,
+ "ISSUE_STAGE_NO_INST", "No instruction issued cycle"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ISSUE_STAGE_EMPTY, minv, maxv,
+ "ISSUE_STAGE_EMPTY", "Empty issue unit cycles"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_INST_OUT_OF_RENAME_STAGE, minv,
+ maxv, "INST_OUT_OF_RENAME_STAGE", "???"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PREDICTABLE_FUNCT_RETURNS, minv,
+ maxv, "PREDICTABLE_FUNCT_RETURNS", "Predictable return "
+ "occurred (?)"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_MAIN_UNIT_EXECUTED_INST, minv,
+ maxv, "MAIN_UNIT_EXECUTED_INST", "Pipe 0 instruction "
+ "executed (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_SECOND_UNIT_EXECUTED_INST, minv,
+ maxv, "SECOND_UNIT_EXECUTED_INST", "Pipe 1 instruction "
+ "executed (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_LD_ST_UNIT_EXECUTED_INST, minv,
+ maxv, "LD_ST_UNIT_EXECUTED_INST", "Load/Store Unit instruction"
+ " executed (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_FP_EXECUTED_INST, minv, maxv,
+ "FP_EXECUTED_INST", "VFP instruction executed (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_NEON_EXECUTED_INST, minv, maxv,
+ "NEON_EXECUTED_INST", "NEON instruction executed (?)"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLD_FULL_DEP_STALL_CYCLES,
+ minv, maxv, "PLD_FULL_DEP_STALL_CYCLES", "PLD stall cycle"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DATA_WR_DEP_STALL_CYCLES, minv,
+ maxv, "DATA_WR_DEP_STALL_CYCLES", "Write stall cycle"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ITLB_MISS_DEP_STALL_CYCLES, minv,
+ maxv, "ITLB_MISS_DEP_STALL_CYCLES", "Instruction stall due to"
+ " main TLB miss (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DTLB_MISS_DEP_STALL_CYCLES, minv,
+ maxv, "DTLB_MISS_DEP_STALL_CYCLES", "Data stall due to main TLB"
+ " miss (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_MICRO_ITLB_MISS_DEP_STALL_CYCLES,
+ minv, maxv, "MICRO_ITLB_MISS_DEP_STALL_CYCLES", "Instruction "
+ "stall due to uTLB miss (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_MICRO_DTLB_MISS_DEP_STALL_CYCLES,
+ minv, maxv, "MICRO_DTLB_MISS_DEP_STALL_CYCLES", "Data stall "
+ "due to uTLB miss (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DMB_DEP_STALL_CYCLES, minv, maxv,
+ "DMB_DEP_STALL_CYCLES", "DMB stall (?)"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_INTGR_CLK_ENABLED_CYCLES, minv,
+ maxv, "INTGR_CLK_ENABLED_CYCLES", "Integer core clock "
+ "enabled (?)"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DATA_ENGINE_CLK_EN_CYCLES, minv,
+ maxv, "DATA_ENGINE_CLK_EN_CYCLES", "Data engine clock enabled"
+ " (?)"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_ISB_INST, minv, maxv, "ISB_INST",
+ "ISB executed"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DSB_INST, minv, maxv, "DSB_INST",
+ "DSB executed"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_DMB_INST, minv, maxv, "DMB_INST",
+ "DMB executed"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_EXT_INTERRUPTS, minv, maxv,
+ "EXT_INTERRUPTS", "External interrupt"),
+
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_CACHE_LINE_RQST_COMPLETED,
+ minv, maxv, "PLE_CACHE_LINE_RQST_COMPLETED", "PLE (Preload "
+ "engine) cache line request completed"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_CACHE_LINE_RQST_SKIPPED, minv,
+ maxv, "PLE_CACHE_LINE_RQST_SKIPPED", "PLE cache line "
+ "request skipped"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_FIFO_FLUSH, minv, maxv,
+ "PLE_FIFO_FLUSH", "PLE FIFO flush"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_RQST_COMPLETED, minv, maxv,
+ "PLE_RQST_COMPLETED", "PLE request completed"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_FIFO_OVERFLOW, minv, maxv,
+ "PLE_FIFO_OVERFLOW", "PLE FIFO overflow"),
+ PERF_EVENT_DESC_ENTRY(ARMV7_PERFCTR_PLE_RQST_PROG, minv, maxv,
+ "PLE_RQST_PROG", "PLE request programmed")
+};
+
+
+/* ********************************************************** */
+
/*
* Cortex-A9 HW events mapping
*/
@@ -1798,6 +2113,11 @@ static struct arm_pmu armv7pmu = {
.max_period = (1LLU << 32) - 1,
};

+const struct list_head *hw_perf_event_get_list(void)
+{
+ return &perf_events_arm;
+}
+
static int __init
init_hw_perf_events(void)
{
@@ -1820,11 +2140,16 @@ init_hw_perf_events(void)
memcpy(armpmu_perf_cache_map, armv6_perf_cache_map,
sizeof(armv6_perf_cache_map));
perf_max_events = armv6pmu.num_events;
+
+ perf_event_add_events(&perf_events_arm, armv6_event_description,
+ ARRAY_SIZE(armv6_event_description));
}
/*
* ARMv7 detection
*/
else if (cpu_architecture() == CPU_ARCH_ARMv7) {
+ perf_event_add_events(&perf_events_arm, armv7_event_description,
+ ARRAY_SIZE(armv7_event_description));
/*
* Cortex-A8 detection
*/
@@ -1834,6 +2159,10 @@ init_hw_perf_events(void)
sizeof(armv7_a8_perf_cache_map));
armv7pmu.event_map = armv7_a8_pmu_event_map;
armpmu = &armv7pmu;
+
+ perf_event_add_events(&perf_events_arm,
+ cortexa8_event_description,
+ ARRAY_SIZE(cortexa8_event_description));
} else
/*
* Cortex-A9 detection
@@ -1846,8 +2175,12 @@ init_hw_perf_events(void)
sizeof(armv7_a9_perf_cache_map));
armv7pmu.event_map = armv7_a9_pmu_event_map;
armpmu = &armv7pmu;
- } else
- perf_max_events = -1;
+
+ perf_event_add_events(&perf_events_arm,
+ cortexa9_event_description,
+ ARRAY_SIZE(cortexa9_event_description));
+ } else
+ perf_max_events = -1;

if (armpmu) {
u32 nb_cnt;
@@ -1867,11 +2200,11 @@ init_hw_perf_events(void)
perf_max_events = -1;
}

- if (armpmu)
+ if (armpmu)
pr_info("enabled with %s PMU driver, %d counters available\n",
armpmu->name, armpmu->num_events);

- return 0;
+ return 0;
}
arch_initcall(init_hw_perf_events);

--
1.5.4.3
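
For readers following the registration logic: perf_event_add_events() in the patch above links each array element in at head->prev, which is a tail append, so the list keeps array order (generic ARMv7 events first, core-specific events after). Below is a minimal userspace sketch of that pattern. The names struct node, struct desc, list_insert() and add_events() are stand-ins invented for this illustration and are not kernel code.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct node {
	struct node *next, *prev;
};

/* Hypothetical, trimmed-down analogue of struct perf_event_description. */
struct desc {
	struct node list;	/* kept first so a node pointer can be cast back */
	unsigned long long config;
	const char *name;
};

/* An empty circular list points at itself in both directions. */
static void list_init(struct node *head)
{
	head->next = head;
	head->prev = head;
}

/* Link 'item' between two known neighbours, as __list_add() does. */
static void list_insert(struct node *item, struct node *prev,
			struct node *next)
{
	next->prev = item;
	item->next = next;
	item->prev = prev;
	prev->next = item;
}

/* Append every array element at the tail, preserving array order. */
static void add_events(struct node *head, struct desc *array, size_t count)
{
	size_t idx;

	for (idx = 0; idx < count; idx++)
		list_insert(&array[idx].list, head->prev, head);
}

/* Count entries by walking the circle until we are back at the head. */
static size_t list_length(const struct node *head)
{
	const struct node *pos;
	size_t n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}
```

Calling add_events() twice on the same head, first with a generic array and then with a core-specific one, yields a single list in which the generic entries come first. That is the order the init code in the patch relies on.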

2010-01-20 09:17:13

by Peter Zijlstra

Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, 2010-01-20 at 10:11 +0100, Tomasz Fujak wrote:
> Hi,
>
> While I managed to build and run the early version (back from
> December), I was unable to find the newest sources (infra + ARMv6,
> ARMv7 support).
> Where do I find them?
>
> The following patches provide a sysfs entry with hardware event human
> readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" %
> (event_value, minval, maxval, name, description) and means to populate
> the file.
> The version posted contains ARMv6, ARMv7 (Cortex-A[89]) support in
> this matter.
>
> The intended use is twofold: for users to read the list directly and
> for tools (like perf).
>
> This series includes:
> [PATCH v1 1/2] perfevent: Add performance event structure definition
> and 'extevents' sysfs entry
> [PATCH v1 2/2] [ARM] perfevent: Event description list for ARMv6,
> Cortex-A8 and Cortex-A9 exported

Why do this in kernel space? Listing available events seems like
something we can do from userspace just fine.

2010-01-20 09:21:57

by Tomasz Fujak

Subject: [PATCH v1 1/2] perfevent: Add performance event structure definition and 'extevents' sysfs entry

This patch adds a structure that contains single hardware performance event
definition (including name and description fields), and sysfs entry suited
to export machine-dependent list of events.

Signed-off-by: Tomasz Fujak <[email protected]>
Reviewed-by: Marek Szyprowski <[email protected]>
Reviewed-by: Kyungmin Park <[email protected]>

---
include/linux/perf_event.h | 19 +++++++++++++++++++
kernel/perf_event.c | 32 ++++++++++++++++++++++++++++++++
2 files changed, 51 insertions(+), 0 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 9e70126..4dc4d73 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -447,6 +447,12 @@ enum perf_callchain_context {

#define PERF_MAX_STACK_DEPTH 255

+#define PERF_EVENT_RAW_BIT (1ULL << 63)
+#define PERF_EVENT_RAW_TO_CONFIG(_val) ((_val) | PERF_EVENT_RAW_BIT)
+#define PERF_EVENT_CONFIG_TO_RAW(_val) ((_val) & ~PERF_EVENT_RAW_BIT)
+#define PERF_EVENT_IS_RAW(_val) ((_val) & PERF_EVENT_RAW_BIT)
+
+
struct perf_callchain_entry {
__u64 nr;
__u64 ip[PERF_MAX_STACK_DEPTH];
@@ -538,6 +544,19 @@ struct perf_mmap_data {
void *data_pages[0];
};

+struct perf_event_description {
+ struct list_head list;
+
+ /* type : 1, subsystem [0..7], id [56..63]*/
+ __u64 config;
+ __u64 min_value; /* min. wakeup period */
+ __u64 max_value; /* max. wakeup period */
+ __u32 flags; /* ??? */
+ __u32 reserved[3];
+ char *name;
+ char *description;
+};
+
struct perf_pending_entry {
struct perf_pending_entry *next;
void (*func)(struct perf_pending_entry *);
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 7f29643..4223870 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -97,6 +97,13 @@ void __weak hw_perf_enable(void) { barrier(); }
void __weak hw_perf_event_setup(int cpu) { barrier(); }
void __weak hw_perf_event_setup_online(int cpu) { barrier(); }

+static LIST_HEAD(perf_event_empty);
+
+const struct list_head __weak *hw_perf_event_get_list(void)
+{
+ return &perf_event_empty;
+}
+
int __weak
hw_perf_group_sched_in(struct perf_event *group_leader,
struct perf_cpu_context *cpuctx,
@@ -5097,6 +5104,23 @@ perf_set_overcommit(struct sysdev_class *class, const char *buf, size_t count)
return count;
}

+static ssize_t perf_show_extevents(struct sysdev_class *class, char *buf)
+{
+ char *str = buf;
+ const struct list_head *head = hw_perf_event_get_list();
+ const struct perf_event_description *entry;
+
+ list_for_each_entry(entry, head, list)
+ if (PERF_EVENT_IS_RAW(entry->config))
+ str += sprintf(str, "0x%llx\t%s\t%lld-%lld\t%s\n",
+ PERF_EVENT_CONFIG_TO_RAW(entry->config),
+ entry->name, entry->min_value,
+ entry->max_value, entry->description);
+
+ return str - buf;
+}
+
+
static SYSDEV_CLASS_ATTR(
reserve_percpu,
0644,
@@ -5111,9 +5135,17 @@ static SYSDEV_CLASS_ATTR(
perf_set_overcommit
);

+static SYSDEV_CLASS_ATTR(
+ extevents,
+ 0444,
+ perf_show_extevents,
+ NULL
+ );
+
static struct attribute *perfclass_attrs[] = {
&attr_reserve_percpu.attr,
&attr_overcommit.attr,
+ &attr_extevents.attr,
NULL
};

--
1.5.4.3
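
To make the exported format concrete: perf_show_extevents() in the patch above emits one tab-separated line per raw event, in the order value, name, range, description. The userspace sketch below reproduces that line layout using the raw-bit macros copied from the patch. struct event_desc and format_event() are hypothetical names. The bounds check and the %llu conversions are choices of this sketch (the patch itself uses sprintf() and %lld), made because a sysfs show buffer is a single page.

```c
#include <stdio.h>
#include <stddef.h>

/* Copied from the patch: bit 63 marks a raw (hardware-specific) event. */
#define PERF_EVENT_RAW_BIT		(1ULL << 63)
#define PERF_EVENT_RAW_TO_CONFIG(_val)	((_val) | PERF_EVENT_RAW_BIT)
#define PERF_EVENT_CONFIG_TO_RAW(_val)	((_val) & ~PERF_EVENT_RAW_BIT)
#define PERF_EVENT_IS_RAW(_val)		((_val) & PERF_EVENT_RAW_BIT)

/* Hypothetical userspace analogue of struct perf_event_description. */
struct event_desc {
	unsigned long long config;
	unsigned long long min_value;
	unsigned long long max_value;
	const char *name;
	const char *description;
};

/*
 * Write one "value\tname\tmin-max\tdescription\n" line into buf.
 * Returns the number of bytes written, 0 for a non-raw event, or -1 if
 * the line does not fit in the 'size' bytes available.
 */
static int format_event(char *buf, size_t size, const struct event_desc *e)
{
	int len;

	if (!PERF_EVENT_IS_RAW(e->config))
		return 0;

	len = snprintf(buf, size, "0x%llx\t%s\t%llu-%llu\t%s\n",
		       PERF_EVENT_CONFIG_TO_RAW(e->config), e->name,
		       e->min_value, e->max_value, e->description);
	if (len < 0 || (size_t)len >= size)
		return -1;
	return len;
}
```

A caller that fills a fixed buffer would advance its write pointer by the return value and stop at the first -1, rather than writing past the end of the page.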

2010-01-20 09:47:36

by Tomasz Fujak

Subject: RE: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

> -----Original Message-----
> From: [email protected] [mailto:linux-arm-
> [email protected]] On Behalf Of Peter Zijlstra
> Sent: Wednesday, January 20, 2010 10:17 AM
> To: Tomasz Fujak
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; linux-arm-
> [email protected]; [email protected]
> Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event
> description in sysfs
>
> On Wed, 2010-01-20 at 10:11 +0100, Tomasz Fujak wrote:
> > Hi,
> >
> > While I managed to build and run the early version (back from
> > December), I was unable to find the newest sources (infra + ARMv6,
> > ARMv7 support).
> > Where do I find them?
> >
> > The following patches provide a sysfs entry with hardware event human
> > readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" %
> > (event_value, minval, maxval, name, description) and means to populate
> > the file.
> > The version posted contains ARMv6, ARMv7 (Cortex-A[89]) support in
> > this matter.
> >
> > The intended use is twofold: for users to read the list directly and
> > for tools (like perf).
> >
> > This series includes:
> > [PATCH v1 1/2] perfevent: Add performance event structure definition
> > and 'extevents' sysfs entry
> > [PATCH v1 2/2] [ARM] perfevent: Event description list for ARMv6,
> > Cortex-A8 and Cortex-A9 exported
>
> Why do this in kernel space? Listing available events seems like
> something we can do from userspace just fine.

Sure we could; that's the other option. But it does not appeal to me. In the
case of userspace tools (like perf, for which the above is meant), they'd need
to come with their own version of the list, which must match the host
platform. Right now perf just forwards the raw event number to the kernel
and that's it. Potentially it could bind a set of supported events to a
platform (how to detect which platform we execute on?). But how do we handle
different revisions and minor changes within a single platform?

That's why I think the kernel should expose supported events. At least with
an identifier suitable to unambiguously detect which HW-defined event it is.

In the proposed approach I also provided a name and a description.
Right now if one wants to set a counter with some non-generic value, a
datasheet comes in handy.
And Joe the average user does not necessarily know the details of the machine
he/she has, let alone the datasheet. With this approach the user is armed
with the event definition, which helps them work around outdated/unsupported
tools.
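
The tool side of this argument can be sketched as follows: a program reads the exported list line by line and looks up events by name instead of shipping its own table. The field order assumed here is the one produced by the sprintf() call in patch 1/2 (value, name, min-max, description), which differs from the order given in the cover letter. struct ext_event, parse_ext_event() and the buffer sizes are hypothetical choices for this sketch.

```c
#include <stdio.h>

#define EV_NAME_LEN 64
#define EV_DESC_LEN 256

/* Hypothetical parsed form of one line of the proposed 'extevents' file. */
struct ext_event {
	unsigned long long value;	/* raw event number */
	unsigned long long min_period;
	unsigned long long max_period;
	char name[EV_NAME_LEN];
	char description[EV_DESC_LEN];
};

/*
 * Parse "0x<hex>\t<name>\t<min>-<max>\t<description>".
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_ext_event(const char *line, struct ext_event *ev)
{
	int fields;

	ev->description[0] = '\0';
	fields = sscanf(line, "%llx\t%63[^\t]\t%llu-%llu\t%255[^\n]",
			&ev->value, ev->name,
			&ev->min_period, &ev->max_period, ev->description);

	/* The description may be empty, so four converted fields suffice. */
	return fields >= 4 ? 0 : -1;
}
```

A tool would call parse_ext_event() on each line obtained with fgets() from the sysfs attribute and pass ev.value on as the raw event number. A multi-field line format like this is easy to parse, but it is also the kind of formatting the sysfs guidelines discourage.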


2010-01-20 09:58:06

by Michal Nazarewicz

Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

>> The following patches provide a sysfs entry with hardware event human
>> readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" %
>> (event_value, minval, maxval, name, description) and means to populate
>> the file.
>>
>> The intended use is twofold: for users to read the list directly and
>> for tools (like perf).

On Wed, 20 Jan 2010 10:16:39 +0100, Peter Zijlstra <[email protected]> wrote:
> Why do this in kernel space? Listing available events seems like
> something we can do from userspace just fine.

IMO kernel knows better what hardware it's running on and user space
should not care and if this list were to be kept in user space it
would have to detect the processor it's running on and act accordingly.

Also, keeping the list in user space could lead to different software
maintaining separate lists which would get out of sync. I think it's
easier to update a single list in the kernel than to wait till all the
software packages update theirs.

This also means that different tools would use different names and
descriptions for the events which would only increase confusion.

Moreover, since kernel already does the hard work of detecting CPU
it may provide a list as well.

But I'm just a humble coder, what do I know... ;)

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michał "mina86" Nazarewicz (o o)
ooo +---[[email protected]]---[[email protected]]---ooO--(_)--Ooo--
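
What "detect the processor" would involve for a userspace tool on ARM can be sketched as below: scan /proc/cpuinfo-style text for the "CPU part" field and map it to a core. The part numbers 0xc08 (Cortex-A8) and 0xc09 (Cortex-A9) are the primary part numbers ARM assigns to those cores. The helper names are hypothetical, and the sketch deliberately ignores revisions, which is the harder problem raised earlier in the thread.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the "CPU part" value from /proc/cpuinfo-style text.
 * Returns the part number, or -1 if the field is absent or malformed.
 */
static long cpuinfo_cpu_part(const char *text)
{
	const char *p = strstr(text, "CPU part");
	unsigned int part;

	if (!p)
		return -1;
	p = strchr(p, ':');
	if (!p || sscanf(p + 1, "%x", &part) != 1)
		return -1;
	return (long)part;
}

/* Map a part number to a core name; only the cores from this thread. */
static const char *arm_core_name(long part)
{
	switch (part) {
	case 0xc08:
		return "Cortex-A8";
	case 0xc09:
		return "Cortex-A9";
	default:
		return "unknown";
	}
}
```

Every tool that keeps its own event table would need a mapping like arm_core_name() plus a per-core list behind it, which is the duplication the message above argues against.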

2010-01-20 09:58:22

by Russell King - ARM Linux

Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, Jan 20, 2010 at 10:11:44AM +0100, Tomasz Fujak wrote:
> The following patches provide a sysfs entry with hardware event human
> readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" %
> (event_value, minval, maxval, name, description)

I think your patch is in violation of this from
Documentation/filesystems/sysfs.txt:

Attributes
~~~~~~~~~
...
Attributes should be ASCII text files, preferably with only one value
per file. It is noted that it may not be efficient to contain only one
value per file, so it is socially acceptable to express an array of
values of the same type.

Mixing types, expressing multiple lines of data, and doing fancy
formatting of data is heavily frowned upon. Doing these things may get
you publically humiliated and your code rewritten without notice.

2010-01-20 10:23:00

by Tomasz Fujak

Subject: RE: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

> -----Original Message-----
> From: [email protected] [mailto:linux-arm-
> [email protected]] On Behalf Of Russell King - ARM
> Linux
> Sent: Wednesday, January 20, 2010 10:58 AM
> To: Tomasz Fujak
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event
> description in sysfs
>
> On Wed, Jan 20, 2010 at 10:11:44AM +0100, Tomasz Fujak wrote:
> > The following patches provide a sysfs entry with hardware event human
> > readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" %
> > (event_value, minval, maxval, name, description)
>
> I think your patch is in violation of this from
> Documentation/filesystems/sysfs.txt:
>
> Attributes
> ~~~~~~~~~
> ...
> Attributes should be ASCII text files, preferably with only one value
> per file. It is noted that it may not be efficient to contain only one
> value per file, so it is socially acceptable to express an array of
> values of the same type.
>
> Mixing types, expressing multiple lines of data, and doing fancy
> formatting of data is heavily frowned upon. Doing these things may get
> you publically humiliated and your code rewritten without notice.

1. There are numerous exceptions:
$ find /sys -exec grep -Hc ^ {} \; 2>/dev/null | grep -c ":[3-9]$"
yielded 43 on my machine.
Some of them list multiple lines with fancy formatting each (e.g.
/sys/class/bluetooth/l2cap or devices/pci*/resource).

2. There are sysfs entries regarding the performance counters already:
'overcommit' and 'reserve_percpu'.
They are simple, I admit, but I find it useful to have all relevant things in
one place.

If the above does not convince you, I could move the file to debugfs.


2010-01-20 13:31:53

by Jamie Iles

Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, Jan 20, 2010 at 10:57:08AM +0100, Michał Nazarewicz wrote:
>>> The following patches provide a sysfs entry with hardware event human
>>> readable description in the form of "0x%llx\t%lld-%lld\t%s\t%s" %
>>> (event_value, minval, maxval, name, description) and means to populate
>>> the file.
>>>
>>> The intended use is twofold: for users to read the list directly and
>>> for tools (like perf).
>
> On Wed, 20 Jan 2010 10:16:39 +0100, Peter Zijlstra <[email protected]> wrote:
>> Why do this in kernel space? Listing available events seems like
>> something we can do from userspace just fine.
>
> IMO the kernel knows better what hardware it's running on, and user space
> should not care. If this list were kept in user space, it would have to
> detect the processor it's running on and act accordingly.
>
> Also, keeping the list in user space could lead to different software
> maintaining separate lists which would get out of sync. I think it's
> easier to update a single list in the kernel than to wait until all the
> software packages update theirs.
>
> This also means that different tools would use different names and
> descriptions for the events which would only increase confusion.
Personally I think this is a good idea. At the moment 'perf list' gives lots
of events that the system isn't capable of counting. Admittedly it's fairly
easy to see if they are supported but it would be nice if the list reflected
the countable events. perf already does this for the tracing events so it
would be nice if it did the same for the hardware events. I guess the same
hierarchy would be nice too.
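
As an illustration of how a tool such as perf might consume the proposed
list, here is a minimal sketch of a parser for one line in the
"0x%llx\t%lld-%lld\t%s\t%s" format quoted above. The struct and function
names are hypothetical and not part of the posted patches; the sketch
assumes that the name and description contain no tab characters.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed line of the proposed 'extevents' format (hypothetical type). */
struct event_desc {
	unsigned long long config;	/* raw event value */
	long long min_value;
	long long max_value;
	char name[64];
	char description[256];
};

/*
 * Parse "0x%llx\t%lld-%lld\t%s\t%s". Returns 0 on success and -1 on a
 * malformed line. The name and description may contain spaces, so they
 * are read up to the next tab or newline instead of with plain %s.
 */
static int parse_event_line(const char *line, struct event_desc *ev)
{
	int n;

	assert(line != NULL && ev != NULL);

	n = sscanf(line, "0x%llx\t%lld-%lld\t%63[^\t\n]\t%255[^\t\n]",
		   &ev->config, &ev->min_value, &ev->max_value,
		   ev->name, ev->description);
	return n == 5 ? 0 : -1;
}
```

A caller would read the sysfs file line by line (for example with fgets())
and pass each line to parse_event_line(), skipping lines that fail to parse.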

The main problem I can envisage is that different CPUs could use slightly
different names for the same event.

Jamie

2010-01-20 13:43:25

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, 2010-01-20 at 13:31 +0000, Jamie Iles wrote:
> Personally I think this is a good idea. At the moment 'perf list' gives lots
> of events that the system isn't capable of counting. Admittedly it's fairly
> easy to see if they are supported but it would be nice if the list reflected
> the countable events. perf already does this for the tracing events so it
> would be nice if it did the same for the hardware events. I guess the same
> hierarchy would be nice too.

This seems to be missing the patch that extends perf list to report the
support and counting status for the events on the current machine :-)

Furthermore, /proc/cpuinfo should be enough information to come up with
an arch specific set of events to be translated into raw.

2010-01-20 13:56:48

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, Jan 20, 2010 at 02:39:39PM +0100, Peter Zijlstra wrote:
> Furthermore, /proc/cpuinfo should be enough information to come up with
> an arch specific set of events to be translated into raw.

Unfortunately, it isn't. CPU identification has become such a murky
business on ARM that the information exported from /proc/cpuinfo can
no longer precisely identify the CPU itself.

For example, we just treat Cortex A8 and A9 as "ARMv7" because from the
kernel's point of view, they're the same.

2010-01-20 14:02:00

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, 2010-01-20 at 13:55 +0000, Russell King - ARM Linux wrote:
>
> Unfortunately, it isn't. CPU identification has become such a murky
> business on ARM that the information exported from /proc/cpuinfo can
> no longer precisely identify the CPU itself.
>
> For example, we just treat Cortex A8 and A9 as "ARMv7" because from the
> kernel's point of view, they're the same.

Would it make sense to extend arm's cpuinfo to include enough
information so that userspace can indeed do this?

It seems to me userspace might care about the exact platform they're
running on.

2010-01-20 14:16:58

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, 2010-01-20 at 15:09 +0100, Michał Nazarewicz wrote:
> On Wed, 20 Jan 2010 15:01:20 +0100, Peter Zijlstra <[email protected]> wrote:
> > It seems to me userspace might care about the exact platform they're
> > running on.
>
> In my humble opinion, user space should never care about platform it's
> running on. Interfaces provided by kernel should suffice to implement
> abstraction layer between user space and hardware. If we abandon that
> we're back in DOS times. But hey, again, that's just my opinion.

Well, you're completely right. But the often sad reality is that perfect
abstraction is either impossible or prohibitively expensive.

2010-01-20 14:20:53

by Michal Nazarewicz

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, 20 Jan 2010 15:01:20 +0100, Peter Zijlstra <[email protected]> wrote:
> It seems to me userspace might care about the exact platform they're
> running on.

In my humble opinion, user space should never care about the platform it's
running on. Interfaces provided by the kernel should suffice to implement
an abstraction layer between user space and hardware. If we abandon that
we're back in DOS times. But hey, again, that's just my opinion.

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michał "mina86" Nazarewicz (o o)
ooo +---[[email protected]]---[[email protected]]---ooO--(_)--Ooo--

2010-01-20 14:28:16

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, 2010-01-20 at 15:16 +0100, Peter Zijlstra wrote:
> On Wed, 2010-01-20 at 15:09 +0100, Michał Nazarewicz wrote:
> > On Wed, 20 Jan 2010 15:01:20 +0100, Peter Zijlstra <[email protected]> wrote:
> > > It seems to me userspace might care about the exact platform they're
> > > running on.
> >
> > In my humble opinion, user space should never care about platform it's
> > running on. Interfaces provided by kernel should suffice to implement
> > abstraction layer between user space and hardware. If we abandon that
> > we're back in DOS times. But hey, again, that's just my opinion.
>
> Well, you're completely right. But the often sad reality is that perfect
> abstraction is either impossible or prohibitively expensive.

And then there is the simple matter of knowing what kind of box it is
without having to resort to a screwdriver or worse.

2010-01-20 14:42:24

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, Jan 20, 2010 at 03:01:20PM +0100, Peter Zijlstra wrote:
> On Wed, 2010-01-20 at 13:55 +0000, Russell King - ARM Linux wrote:
> >
> > Unfortunately, it isn't. CPU identification has become such a murky
> > business on ARM that the information exported from /proc/cpuinfo can
> > no longer precisely identify the CPU itself.
> >
> > For example, we just treat Cortex A8 and A9 as "ARMv7" because from the
> > kernel's point of view, they're the same.
>
> Would it make sense to extend arm's cpuinfo to include enough
> information so that userspace can indeed do this?

The idea that "I'm running on a Cortex A9" is no longer provided by the
new CPU ID scheme. Instead, what's now provided is a set of registers
which describe various individual features of the CPU:

- ThumbEE ISA level, Jazelle ISA level, Thumb ISA level, ARM ISA level.
- Programmer model (not much here that userspace would be interested in)
- Debug model (memory mapped/co-processor, v6 debug architecture, v7 debug
architecture.)
- Four 32-bit registers describing the memory model.

Note that pre-ARMv6k does not provide this information. Plus, the
interpretation of these registers changes between ARMv6k and ARMv7 -
and I wouldn't be surprised if the interpretation changes in the
future - just like the 'cache type' register completely changed format
on ARMv7.

> It seems to me userspace might care about the exact platform they're
> running on.

It may have wanted to care at one time, but as time goes on, knowing what
the high-level chip is will become irrelevant, and is actually the
wrong question.

The real questions that userspace needs to ask are the specific ones,
such as "what ARM ISA level is supported? what Thumb ISA level is
supported? what debug model is implemented?"

Given that history has shown that identification schemes on ARM change
in extremely annoying ways, I don't think decoding these registers to
some kind of textual representation for /proc/cpuinfo is the right
approach. It might instead make more sense to just export the entire
set of CPU ID registers to userspace, and let userspace grapple with
the complexities of decoding the information it wants from them.
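
To make the "export the raw registers" idea concrete, here is a minimal
sketch of how userspace might decode the instruction-set levels from one
such register. It assumes the ARMv7 layout of the Processor Feature
Register 0 (ID_PFR0), in which each 4-bit field gives a support level; as
noted above, the interpretation is not guaranteed to be the same on ARMv6k.
How userspace would obtain the raw value is left open, and the type and
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Extract the 4-bit field that starts at bit 'shift' of an ID register. */
static unsigned int id_field(uint32_t reg, unsigned int shift)
{
	assert(shift <= 28 && shift % 4 == 0);
	return (reg >> shift) & 0xf;
}

/* Support levels as laid out in ID_PFR0 on ARMv7 (assumed layout). */
struct isa_levels {
	unsigned int arm;	/* ID_PFR0[3:0]   */
	unsigned int thumb;	/* ID_PFR0[7:4]   */
	unsigned int jazelle;	/* ID_PFR0[11:8]  */
	unsigned int thumbee;	/* ID_PFR0[15:12] */
};

static struct isa_levels decode_id_pfr0(uint32_t id_pfr0)
{
	struct isa_levels lv;

	lv.arm = id_field(id_pfr0, 0);
	lv.thumb = id_field(id_pfr0, 4);
	lv.jazelle = id_field(id_pfr0, 8);
	lv.thumbee = id_field(id_pfr0, 12);
	return lv;
}
```

With a sample raw value of 0x00001131, this yields ARM level 1, Thumb
level 3, Jazelle level 1 and ThumbEE level 1. A value of 0 in a field means
the corresponding instruction set is not implemented.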

2010-01-20 14:46:41

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, Jan 20, 2010 at 03:26:49PM +0100, Peter Zijlstra wrote:
> On Wed, 2010-01-20 at 15:16 +0100, Peter Zijlstra wrote:
> > On Wed, 2010-01-20 at 15:09 +0100, Michał Nazarewicz wrote:
> > > On Wed, 20 Jan 2010 15:01:20 +0100, Peter Zijlstra <[email protected]> wrote:
> > > > It seems to me userspace might care about the exact platform they're
> > > > running on.
> > >
> > > In my humble opinion, user space should never care about platform it's
> > > running on. Interfaces provided by kernel should suffice to implement
> > > abstraction layer between user space and hardware. If we abandon that
> > > we're back in DOS times. But hey, again, that's just my opinion.
> >
> > Well, you're completely right. But the often sad reality is that perfect
> > abstraction is either impossible or prohibitively expensive.
>
> And then there is the simple matter of knowing what kind of box it is
> without having to resort to a screwdriver or worse.

If you're expecting the CPU to tell you that, give up now. The CPU
will tell you about the CPU core, not the SoC.

All SoCs that contain an ARM926 core report that they are an ARM926
CPU; that doesn't tell you that the surrounding hardware is an Atmel
SoC, Samsung SoC, etc.

Even some buggy CPUs which aren't an ARM926 report themselves as an
ARM926 (Feroceon) while being incompatible with the ARM926 on several
levels. (Apparently, the argument was that they wanted ARM926
software to run on Feroceon, or something like that.)

That's why we have the value passed in from the boot loader; there's
no other way to tell what SoC you're running on.

2010-01-20 14:55:16

by Michal Nazarewicz

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

>> On Wed, 20 Jan 2010 15:01:20 +0100, Peter Zijlstra <[email protected]> wrote:
>>> It seems to me userspace might care about the exact platform they're
>>> running on.

> On Wed, 2010-01-20 at 15:09 +0100, Michał Nazarewicz wrote:
>> In my humble opinion, user space should never care about platform it's
>> running on. Interfaces provided by kernel should suffice to implement
>> abstraction layer between user space and hardware. If we abandon that
>> we're back in DOS times. But hey, again, that's just my opinion.

On Wed, 20 Jan 2010 15:16:19 +0100, Peter Zijlstra <[email protected]> wrote:
> Well, you're completely right. But the often sad reality is that perfect
> abstraction is either impossible or prohibitively expensive.

Yes, I agree and am aware of that, but I think it's not the case with
performance events. It is possible for the kernel to provide such a list
and at the same time it's not that expensive (it's a matter of hardcoding
a list in the source and possibly altering it a bit according to hardware
detection, which is done anyway).

Of course, it's not all gold -- maintaining such a list increases the
complexity of the kernel and adds the burden of keeping the lists in
sync with reality.

Still, in my opinion, the advantages of a list maintained in the
kernel outweigh the disadvantages, so I'd opt for that
solution. (Of course, I'm not some kind of ARM Linux guru so I may
be simply wrong.)
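
The "hardcoded list" idea can be sketched in plain userspace C as a static
table plus a formatter that emits the "0x%llx\t%lld-%lld\t%s\t%s" lines
proposed in the cover letter. This is only an illustration: the struct
mirrors the fields used by PERF_EVENT_DESC_ENTRY in the patch, but the type
name, the function name and the two sample entries (values, names and
descriptions) are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace mirror of the patch's event description fields. */
struct hw_event_desc {
	unsigned long long config;
	long long min_value;
	long long max_value;
	const char *name;
	const char *description;
};

/* Illustrative entries only; not taken from the posted event lists. */
static const struct hw_event_desc sample_events[] = {
	{ 0x08, 0, 0, "INSTR_EXECUTED", "Instructions executed" },
	{ 0x11, 0, 0, "CPU_CYCLES", "Processor cycles" },
};

/*
 * Write one line per event into buf. Returns the number of characters
 * written (excluding the terminating NUL), or -1 if buf is too small.
 */
static int format_events(const struct hw_event_desc *ev, size_t count,
			 char *buf, size_t size)
{
	size_t used = 0;
	size_t i;

	assert(ev != NULL && buf != NULL && size > 0);
	buf[0] = '\0';

	for (i = 0; i < count; i++) {
		int n = snprintf(buf + used, size - used,
				 "0x%llx\t%lld-%lld\t%s\t%s\n",
				 ev[i].config, ev[i].min_value,
				 ev[i].max_value, ev[i].name,
				 ev[i].description);

		if (n < 0 || (size_t)n >= size - used)
			return -1;
		used += (size_t)n;
	}
	return (int)used;
}
```

Selecting a different table per detected CPU would then be a matter of
passing a different array to format_events().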

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michał "mina86" Nazarewicz (o o)
ooo +---[[email protected]]---[[email protected]]---ooO--(_)--Ooo--

2010-01-20 15:02:39

by Jamie Iles

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, Jan 20, 2010 at 02:41:40PM +0000, Russell King - ARM Linux wrote:
> Given that history has shown that identification schemes on ARM change
> in extremely annoying ways, I don't think decoding these registers to
> some kind of textual representation for /proc/cpuinfo is the right
> approach. It might instead make more sense to just export the entire
> set of CPU ID registers to userspace, and let userspace grapple with
> the complexities of decoding the information it wants from them.
Yes, this would probably be the best generic solution, but in the specific
case of ARM perfevents, the kernel code already has to decode some of the CPU
ID registers to work out what set of events to use. Why make userspace do all
of this decoding again? The x86 code sets up the x86_pmu depending on CPU type
so this is doing a similar thing (although it is easier for x86).

Having perf do all of this decoding for all of the supported CPU types when
the kernel has already done it once and maintaining 2 sets of event lists
seems a bit fiddly compared to simply exporting the supported events from the
kernel...

Jamie

2010-01-20 15:44:00

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, Jan 20, 2010 at 03:03:03PM +0000, Jamie Iles wrote:
> On Wed, Jan 20, 2010 at 02:41:40PM +0000, Russell King - ARM Linux wrote:
> > Given that history has shown that identification schemes on ARM change
> > in extremely annoying ways, I don't think decoding these registers to
> > some kind of textual representation for /proc/cpuinfo is the right
> > approach. It might instead make more sense to just export the entire
> > set of CPU ID registers to userspace, and let userspace grapple with
> > the complexities of decoding the information it wants from them.
> Yes, this would probably be the best generic solution, but in the specific
> case of ARM perfevents, the kernel code already has to decode some of the CPU
> ID registers to work out what set of events to use. Why make userspace do all
> of this decoding again? The x86 code sets up the x86_pmu depending on CPU type
> so this is doing a similar thing (although it is easier for x86).

If you're referring to reading the main CPU ID register and relying
on the part number telling you what CPU you're running on, that's
unreliable if you're only checking the part number - you at least
need to check the implementer.

If you want to do ID checking via the main ID register, there are
some clashes even if you take the implementer field into account.

2010-01-20 16:18:29

by Jamie Iles

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, Jan 20, 2010 at 03:42:50PM +0000, Russell King - ARM Linux wrote:
> If you're referring to reading the main CPU ID register and relying
> on the part number telling you what CPU you're running on, that's
> unreliable if you're only checking the part number - you at least
> need to check the implementer.
>
> If you want to do ID checking via the main ID register, there are
> some clashes even if you take the implementer field into account.
Ok, so for the kernel based code I should check the implementer and part
number then. For now we can make sure that the implementer is ARM and add
others if they have compatible PMUs and hope that there aren't any clashes
with nasty side effects.
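
A minimal sketch of such a check, assuming the usual main ID register
layout (implementer in bits [31:24], primary part number in bits [15:4])
and the ARM Ltd. part numbers 0xC08 and 0xC09 for Cortex-A8 and Cortex-A9.
The enum and function names are hypothetical, and as pointed out above,
even an implementer plus part number match is not guaranteed to be free of
clashes.

```c
#include <stdint.h>

#define MIDR_IMPLEMENTER(midr)	(((midr) >> 24) & 0xff)
#define MIDR_PARTNUM(midr)	(((midr) >> 4) & 0xfff)

#define IMPLEMENTER_ARM		0x41	/* ASCII 'A' */
#define PART_CORTEX_A8		0xc08
#define PART_CORTEX_A9		0xc09

enum pmu_type {
	PMU_UNKNOWN,
	PMU_CORTEX_A8,
	PMU_CORTEX_A9,
};

/* Pick an event set only when both implementer and part number match. */
static enum pmu_type pmu_type_from_midr(uint32_t midr)
{
	if (MIDR_IMPLEMENTER(midr) != IMPLEMENTER_ARM)
		return PMU_UNKNOWN;

	switch (MIDR_PARTNUM(midr)) {
	case PART_CORTEX_A8:
		return PMU_CORTEX_A8;
	case PART_CORTEX_A9:
		return PMU_CORTEX_A9;
	default:
		return PMU_UNKNOWN;
	}
}
```

Anything that does not match falls through to PMU_UNKNOWN, so an
unrecognised core gets no event list rather than a wrong one.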

Jamie

2010-01-20 16:27:15

by Jamie Lokier

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

Russell King - ARM Linux wrote:
> On Wed, Jan 20, 2010 at 03:01:20PM +0100, Peter Zijlstra wrote:
> > On Wed, 2010-01-20 at 13:55 +0000, Russell King - ARM Linux wrote:
> > >
> > > Unfortunately, it isn't. CPU identification has become such a murky
> > > business on ARM that the information exported from /proc/cpuinfo can
> > > no longer precisely identify the CPU itself.
> > >
> > > For example, we just treat Cortex A8 and A9 as "ARMv7" because from the
> > > kernel's point of view, they're the same.
> >
> > Would it make sense to extend arm's cpuinfo to include enough
> > information so that userspace can indeed do this?
>
> The idea that "I'm running on a Cortex A9" is no longer provided by the
> new CPU ID scheme. Instead, what's now provided is a set of registers
> which describe various individual features of the CPU:
>
> - ThumbEE ISA level, Jazelle ISA level, Thumb ISA level, ARM ISA level.
> - Programmer model (not much here that userspace would be interested in)
> - Debug model (memory mapped/co-processor, v6 debug architecture, v7 debug
> architecture.)
> - Four 32-bit registers describing the memory model.
>
> Note that pre-ARMv6k does not provide this information. Plus, the
> interpretation of these registers changes between ARMv6k and ARMv7 -
> and I wouldn't be surprised if the interpretation changes in the
> future - just like the 'cache type' register completely changed format
> on ARMv7.
>
> > It seems to me userspace might care about the exact platform they're
> > running on.
>
> It may have wanted to care at one time, but as time goes on, knowing what
> the high-level chip is will become irrelevant, and is actually the
> wrong question.
>
> The real questions that userspace needs to ask are the specific ones,
> such as "what ARM ISA level is supported? what Thumb ISA level is
> supported? what debug model is implemented?"
>
> Given that history has shown that identification schemes on ARM change
> in extremely annoying ways, I don't think decoding these registers to
> some kind of textual representation for /proc/cpuinfo is the right
> approach. It might instead make more sense to just export the entire
> set of CPU ID registers to userspace, and let userspace grapple with
> the complexities of decoding the information it wants from them.

In practice, the list of capabilities works well on x86 in /proc/cpuinfo:

flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts pni monitor vmx est tm2 xtpr pdcm

They are based on the feature bits from the CPU's cpuid instruction,
but the kernel does things like apply errata quirks to remove bits
that don't work on a particular implementation and show the lowest common
denominator when there are multiple CPUs.

Userspace tends to look for features it cares about (e.g. sse means
sse instructions are available), and doesn't need to know anything
about murky details of different CPUs.

Many of the features aren't relevant to userspace; the rest tend to
indicate the presence of particular instructions.
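
For illustration, a userspace check of this kind can be as small as the
following sketch. It looks for a feature as a whole word in a "flags"
line that has already been read from /proc/cpuinfo; the function name is
hypothetical, and reading the file itself is left out.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if 'flag' appears as a whole word in a "flags : ..." line. */
static bool cpuinfo_has_flag(const char *flags_line, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = strchr(flags_line, ':');

	if (p == NULL || len == 0)
		return false;
	p++;	/* skip the ':' that separates the key from the values */

	while (*p != '\0') {
		size_t tok;

		p += strspn(p, " \t\n");	/* skip separators */
		tok = strcspn(p, " \t\n");	/* length of the next word */
		if (tok == len && strncmp(p, flag, len) == 0)
			return true;
		p += tok;
	}
	return false;
}
```

Matching whole words matters here: a plain substring search for "sse"
would also match "sse2".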

On ARM, it would be great to have a simple set of features in
/proc/cpuinfo indicating which instruction sets are available (and
reliable).

-- Jamie

2010-01-20 16:36:19

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH/RFC v1 0/2] Human readable performance event description in sysfs

On Wed, Jan 20, 2010 at 04:26:47PM +0000, Jamie Lokier wrote:
> In practice, the list of capabilities works well on x86 in /proc/cpuinfo:
>
> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon bts pni monitor vmx est tm2 xtpr pdcm
>
> They are based on the feature bits from the CPU's cpuid instruction,
> but the kernel does things like apply errata quirks to remove bits
> that don't work on a particular implementation and show the lowest common
> denominator when there are multiple CPUs.

You're assuming that there's a fixed set of feature bits on ARM. There
aren't.

What you have is a main ID register up until ARMv6, which has about
four different encodings. On some CPUs, this is the only ID register
offered, and within that subset, some different CPUs (eg, implemented
by different manufacturers, or indeed the same manufacturer) have the
same ID register value, despite being rather different.

From ARMv6k and later, we have a different ID scheme, where we have
about 10 32-bit registers giving detailed information about various
aspects of the CPU - including five 32-bit registers for details about
the instruction set.

We know that the meanings of some of these registers have changed -
and I don't think there's a way to identify which meaning
should be applied to the registers (it seems to require reading lots
of different documents to sort out which CPUs implement which method.)

Frankly, it's a mess, and when you look at implementations, it turns out
to be unreliable.

> On ARM, it would be great to have a simple set of features in
> /proc/cpuinfo indicating which instruction sets are available (and
> reliable).

I think you're living in a dream world there.