This series addresses a few issues with how the MIPS performance
counters code supports the hardware multithreading MT ASE.
Firstly, implementations of the MT ASE may implement performance
counters per core or per thread (TC). MIPS Technologies implementations
signal this via a bit in the implementation-specific CONFIG7 register.
Since this register is implementation specific, checking it should be
guarded by a PRID check. This also replaces a bit previously defined by
a magic number.
Secondly, the code currently uses vpe_id(), defined as
smp_processor_id() divided by 2, to share core performance counters
between VPEs. This relies on a couple of assumptions about the hardware
implementation to function correctly (always 2 VPEs per core, and the
hardware reading only the least significant bit).
Finally, the method of sharing core performance counters between VPEs
is suboptimal, since it does not allow one process running on a VPE to
use all of the performance counters available to it: the kernel
reserves half of the counters for the other VPE, even if that VPE may
never use them. This reservation at counter probe time is replaced with
an allocation-on-use strategy.
Tested on a MIPS Creator CI40 (2C2T MIPS InterAptiv with per-TC
counters, though for the purposes of testing the per-TC availability was
hardcoded to allow testing both paths).
Series applies to v4.16
Changes in v3:
New patch to detect feature presence in cpu-probe.c
Use flag in cpu_data set by cpu_probe.c to indicate feature presence.
- rebase on new feature detection
Changes in v2:
Fix mipsxx_pmu_enable_event for !#ifdef CONFIG_MIPS_MT_SMP
- Fix !#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS build
- re-use cpuc variable in mipsxx_pmu_alloc_counter,
mipsxx_pmu_free_counter rather than having sibling_ version.
Since BMIPS5000 does not implement per TC counters, we can remove the
check on cpu_has_mipsmt_pertccounters.
New patch to fix BMIPS5000 system mode perf.
Matt Redfearn (7):
MIPS: Probe for MIPS MT perf counters per TC
MIPS: perf: More robustly probe for the presence of per-tc counters
MIPS: perf: Use correct VPE ID when setting up VPE tracing
MIPS: perf: Fix perf with MT counting other threads
MIPS: perf: Allocate per-core counters on demand
MIPS: perf: Fold vpe_id() macro into its one last usage
MIPS: perf: Fix BMIPS5000 system mode counting
arch/mips/include/asm/cpu-features.h | 7 ++
arch/mips/include/asm/cpu.h | 2 +
arch/mips/include/asm/mipsregs.h | 6 +
arch/mips/kernel/cpu-probe.c | 12 ++
arch/mips/kernel/perf_event_mipsxx.c | 232 +++++++++++++++++++----------------
arch/mips/oprofile/op_model_mipsxx.c | 2 -
6 files changed, 150 insertions(+), 111 deletions(-)
--
2.7.4
Processors implementing the MIPS MT ASE may have performance counters
implemented per core or per TC. Processors implemented by MIPS
Technologies signal per-TC presence through a bit in the
implementation-specific Config7 register. Currently the code which
probes for their presence blindly reads a magic number corresponding to
this bit, despite it potentially having a different meaning on other
CPU implementations.
Since CPU features are generally detected by cpu-probe.c, perform the
detection there instead. Introduce cpu_set_mt_per_tc_perf(), which
checks the bit in Config7, and call it for the MIPS CPUs known to
implement this bit and the MT ASE: specifically, the 34K, 1004K and
interAptiv. Once the presence of per-TC counters is indicated in
cpu_data, tests for the feature can be updated to use this flag.
Suggested-by: James Hogan <[email protected]>
Signed-off-by: Matt Redfearn <[email protected]>
---
Changes in v3:
New patch to detect feature presence in cpu-probe.c
Changes in v2: None
arch/mips/include/asm/cpu.h | 2 ++
arch/mips/include/asm/mipsregs.h | 5 +++++
arch/mips/kernel/cpu-probe.c | 12 ++++++++++++
3 files changed, 19 insertions(+)
diff --git a/arch/mips/include/asm/cpu.h b/arch/mips/include/asm/cpu.h
index d39324c4adf1..5b9d02ef4f60 100644
--- a/arch/mips/include/asm/cpu.h
+++ b/arch/mips/include/asm/cpu.h
@@ -418,6 +418,8 @@ enum cpu_type_enum {
MBIT_ULL(54) /* CPU shares FTLB RAM with another */
#define MIPS_CPU_SHARED_FTLB_ENTRIES \
MBIT_ULL(55) /* CPU shares FTLB entries with another */
+#define MIPS_CPU_MT_PER_TC_PERF_COUNTERS \
+ MBIT_ULL(56) /* CPU has perf counters implemented per TC (MIPSMT ASE) */
/*
* CPU ASE encodings
diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index 858752dac337..a4baaaa02bc8 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -684,6 +684,11 @@
#define MIPS_CONF7_IAR (_ULCAST_(1) << 10)
#define MIPS_CONF7_AR (_ULCAST_(1) << 16)
+/* Config7 Bits specific to MIPS Technologies. */
+
+/* Performance counters implemented Per TC */
+#define MTI_CONF7_PTC (_ULCAST_(1) << 19)
+
/* WatchLo* register definitions */
#define MIPS_WATCHLO_IRW (_ULCAST_(0x7) << 0)
diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index cf3fd549e16d..1241c2a23d90 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -414,6 +414,14 @@ static int __init ftlb_disable(char *s)
__setup("noftlb", ftlb_disable);
+/*
+ * Check if the CPU has per tc perf counters
+ */
+static inline void cpu_set_mt_per_tc_perf(struct cpuinfo_mips *c)
+{
+ if (read_c0_config7() & MTI_CONF7_PTC)
+ c->options |= MIPS_CPU_MT_PER_TC_PERF_COUNTERS;
+}
static inline void check_errata(void)
{
@@ -1569,6 +1577,7 @@ static inline void cpu_probe_mips(struct cpuinfo_mips *c, unsigned int cpu)
c->cputype = CPU_34K;
c->writecombine = _CACHE_UNCACHED;
__cpu_name[cpu] = "MIPS 34Kc";
+ cpu_set_mt_per_tc_perf(c);
break;
case PRID_IMP_74K:
c->cputype = CPU_74K;
@@ -1589,6 +1598,7 @@ static inline void cpu_probe_mips(struct cpuinfo_mips *c, unsigned int cpu)
c->cputype = CPU_1004K;
c->writecombine = _CACHE_UNCACHED;
__cpu_name[cpu] = "MIPS 1004Kc";
+ cpu_set_mt_per_tc_perf(c);
break;
case PRID_IMP_1074K:
c->cputype = CPU_1074K;
@@ -1598,10 +1608,12 @@ static inline void cpu_probe_mips(struct cpuinfo_mips *c, unsigned int cpu)
case PRID_IMP_INTERAPTIV_UP:
c->cputype = CPU_INTERAPTIV;
__cpu_name[cpu] = "MIPS interAptiv";
+ cpu_set_mt_per_tc_perf(c);
break;
case PRID_IMP_INTERAPTIV_MP:
c->cputype = CPU_INTERAPTIV;
__cpu_name[cpu] = "MIPS interAptiv (multi)";
+ cpu_set_mt_per_tc_perf(c);
break;
case PRID_IMP_PROAPTIV_UP:
c->cputype = CPU_PROAPTIV;
--
2.7.4
There are a couple of FIXMEs in the perf code which state that
cpu_data[event->cpu].vpe_id reports 0 for both CPUs. This is no longer
the case, since the vpe_id field is now set up and used extensively by
SMP CPS.
VPE-local counting gets around this by using smp_processor_id() instead.
As it happens this does work correctly to count events on the right VPE,
but relies on 2 assumptions:
a) Always having 2 VPEs / core.
b) The hardware only paying attention to the least significant bit of
the PERFCTL.VPEID field.
If either of these assumptions changes, then events from the incorrect
VPE will be counted.
Fix this by replacing smp_processor_id() with
cpu_vpe_id(&current_cpu_data) in the vpe_id() macro, and pass vpe_id()
to M_PERFCTL_VPEID() when setting up PERFCTL.VPEID. The FIXMEs can also
be removed since they no longer apply.
Signed-off-by: Matt Redfearn <[email protected]>
---
Changes in v3: None
Changes in v2: None
arch/mips/kernel/perf_event_mipsxx.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index 0595a974bc81..7e2b7d38a774 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -135,12 +135,8 @@ static DEFINE_RWLOCK(pmuint_rwlock);
#define vpe_id() (cpu_has_mipsmt_pertccounters ? \
0 : (smp_processor_id() & MIPS_CPUID_TO_COUNTER_MASK))
#else
-/*
- * FIXME: For VSMP, vpe_id() is redefined for Perf-events, because
- * cpu_data[cpuid].vpe_id reports 0 for _both_ CPUs.
- */
#define vpe_id() (cpu_has_mipsmt_pertccounters ? \
- 0 : smp_processor_id())
+ 0 : cpu_vpe_id(&current_cpu_data))
#endif
/* Copied from op_model_mipsxx.c */
@@ -1277,11 +1273,7 @@ static void check_and_calc_range(struct perf_event *event,
*/
hwc->config_base |= M_TC_EN_ALL;
} else {
- /*
- * FIXME: cpu_data[event->cpu].vpe_id reports 0
- * for both CPUs.
- */
- hwc->config_base |= M_PERFCTL_VPEID(event->cpu);
+ hwc->config_base |= M_PERFCTL_VPEID(vpe_id());
hwc->config_base |= M_TC_EN_VPE;
}
} else
--
2.7.4
When perf is used in non-system mode, i.e. without specifying CPUs to
count on, check_and_calc_range falls into the case in which it sets
M_TC_EN_ALL in the counter config_base. This has the effect of always
counting events on all of the threads in a core, even when the user has
not requested it. For example this can be seen with a test program which
executes 30002 instructions and 10000 branches running on one VPE and a
busy load on the other VPE in the core. Without this commit, the
expected count is not returned:
taskset 4 dd if=/dev/zero of=/dev/null count=100000 & taskset 8 perf
stat -e instructions:u,branches:u ./test_prog
Performance counter stats for './test_prog':
103235 instructions:u
17015 branches:u
In order to fix this, remove check_and_calc_range entirely and perform
all of the logic in mipsxx_pmu_enable_event. Since
mipsxx_pmu_enable_event now requires the range of the event, ensure that
it is set by mipspmu_perf_event_encode in the same circumstances as
before (i.e. #ifdef CONFIG_MIPS_MT_SMP && num_possible_cpus() > 1).
The logic of mipsxx_pmu_enable_event now becomes:
If the CPU is a BMIPS5000, then use the special vpe_id() implementation
to select which VPE to count.
If the counter has a range greater than a single VPE, i.e. it is a
core-wide counter, then ensure that the counter is set up to count
events from all TCs (though, since this is true by definition, is this
necessary? Just enabling a core-wide counter in the per-VPE case appears
experimentally to return the same counts. This is left in for now as the
logic was present before).
If the event is set up to count a particular CPU (i.e. system mode),
then the VPE ID of that CPU is used for the counter.
Otherwise, the event should be counted on the CPU scheduling this thread
(this was the critical bit missing from the previous implementation) so
the VPE ID of this CPU is used for the counter.
With this commit, the same test as before returns the counts expected:
taskset 4 dd if=/dev/zero of=/dev/null count=100000 & taskset 8 perf
stat -e instructions:u,branches:u ./test_prog
Performance counter stats for './test_prog':
30002 instructions:u
10000 branches:u
Signed-off-by: Matt Redfearn <[email protected]>
---
Changes in v3: None
Changes in v2:
Fix mipsxx_pmu_enable_event for !#ifdef CONFIG_MIPS_MT_SMP
arch/mips/kernel/perf_event_mipsxx.c | 78 ++++++++++++++++++------------------
1 file changed, 39 insertions(+), 39 deletions(-)
diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index 7e2b7d38a774..fe50986e83c6 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -323,7 +323,11 @@ static int mipsxx_pmu_alloc_counter(struct cpu_hw_events *cpuc,
static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
{
+ struct perf_event *event = container_of(evt, struct perf_event, hw);
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+#ifdef CONFIG_MIPS_MT_SMP
+ unsigned int range = evt->event_base >> 24;
+#endif /* CONFIG_MIPS_MT_SMP */
WARN_ON(idx < 0 || idx >= mipspmu.num_counters);
@@ -331,11 +335,37 @@ static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
(evt->config_base & M_PERFCTL_CONFIG_MASK) |
/* Make sure interrupt enabled. */
MIPS_PERFCTRL_IE;
- if (IS_ENABLED(CONFIG_CPU_BMIPS5000))
+
+#ifdef CONFIG_CPU_BMIPS5000
+ {
/* enable the counter for the calling thread */
cpuc->saved_ctrl[idx] |=
(1 << (12 + vpe_id())) | BRCM_PERFCTRL_TC;
+ }
+#else
+#ifdef CONFIG_MIPS_MT_SMP
+ if (range > V) {
+ /* The counter is processor wide. Set it up to count all TCs. */
+ pr_debug("Enabling perf counter for all TCs\n");
+ cpuc->saved_ctrl[idx] |= M_TC_EN_ALL;
+ } else
+#endif /* CONFIG_MIPS_MT_SMP */
+ {
+ unsigned int cpu, ctrl;
+ /*
+ * Set up the counter for a particular CPU when event->cpu is
+ * a valid CPU number. Otherwise set up the counter for the CPU
+ * scheduling this thread.
+ */
+ cpu = (event->cpu >= 0) ? event->cpu : smp_processor_id();
+
+ ctrl = M_PERFCTL_VPEID(cpu_vpe_id(&cpu_data[cpu]));
+ ctrl |= M_TC_EN_VPE;
+ cpuc->saved_ctrl[idx] |= ctrl;
+ pr_debug("Enabling perf counter for CPU%d\n", cpu);
+ }
+#endif /* CONFIG_CPU_BMIPS5000 */
/*
* We do not actually let the counter run. Leave it until start().
*/
@@ -649,13 +679,14 @@ static unsigned int mipspmu_perf_event_encode(const struct mips_perf_event *pev)
* event_id.
*/
#ifdef CONFIG_MIPS_MT_SMP
- return ((unsigned int)pev->range << 24) |
- (pev->cntr_mask & 0xffff00) |
- (pev->event_id & 0xff);
-#else
- return (pev->cntr_mask & 0xffff00) |
- (pev->event_id & 0xff);
-#endif
+ if (num_possible_cpus() > 1)
+ return ((unsigned int)pev->range << 24) |
+ (pev->cntr_mask & 0xffff00) |
+ (pev->event_id & 0xff);
+ else
+#endif /* CONFIG_MIPS_MT_SMP */
+ return ((pev->cntr_mask & 0xffff00) |
+ (pev->event_id & 0xff));
}
static const struct mips_perf_event *mipspmu_map_general_event(int idx)
@@ -1259,33 +1290,6 @@ static const struct mips_perf_event xlp_cache_map
},
};
-#ifdef CONFIG_MIPS_MT_SMP
-static void check_and_calc_range(struct perf_event *event,
- const struct mips_perf_event *pev)
-{
- struct hw_perf_event *hwc = &event->hw;
-
- if (event->cpu >= 0) {
- if (pev->range > V) {
- /*
- * The user selected an event that is processor
- * wide, while expecting it to be VPE wide.
- */
- hwc->config_base |= M_TC_EN_ALL;
- } else {
- hwc->config_base |= M_PERFCTL_VPEID(vpe_id());
- hwc->config_base |= M_TC_EN_VPE;
- }
- } else
- hwc->config_base |= M_TC_EN_ALL;
-}
-#else
-static void check_and_calc_range(struct perf_event *event,
- const struct mips_perf_event *pev)
-{
-}
-#endif
-
static int __hw_perf_event_init(struct perf_event *event)
{
struct perf_event_attr *attr = &event->attr;
@@ -1321,10 +1325,6 @@ static int __hw_perf_event_init(struct perf_event *event)
*/
hwc->config_base = MIPS_PERFCTRL_IE;
- /* Calculate range bits and validate it. */
- if (num_possible_cpus() > 1)
- check_and_calc_range(event, pev);
-
hwc->event_base = mipspmu_perf_event_encode(pev);
if (PERF_TYPE_RAW == event->attr.type)
mutex_unlock(&raw_event_mutex);
--
2.7.4
Previously, when performance counters are per-core rather than
per-thread, the number available was divided by 2 on detection, and the
counters used by each thread in a core were "swizzled" to ensure
separation. However, this solution is suboptimal since it relies on a
couple of assumptions:
a) Always having 2 VPEs / core (number of counters was divided by 2)
b) Always having a number of counters implemented in the core that is
divisible by 2. For instance if an SoC implementation had a single
counter and 2 VPEs per core, then this logic would fail and no
performance counters would be available.
The mechanism also does not allow one VPE in a core to use more than
its allocation of the per-core counters to count multiple events, even
though the other VPEs may not be using them.
Fix this situation by instead allocating (and releasing) per-core
performance counters when they are requested. This approach removes the
above assumptions and fixes the shortcomings.
In order to do this:
Add additional logic to mipsxx_pmu_alloc_counter() to detect if a
sibling is using a per-core counter, and to allocate a per-core counter
in all sibling CPUs.
Similarly, add a mipsxx_pmu_free_counter() function to release a
per-core counter in all sibling CPUs when it is finished with.
A new spinlock, core_counters_lock, is introduced to ensure exclusivity
when allocating and releasing per-core counters.
Since counters are now allocated per-core on demand, rather than being
reserved per-thread at boot, all of the "swizzling" of counters is
removed.
The upshot is that in an SoC with 2 counters / thread, counters are
reported as:
Performance counters: mips/interAptiv PMU enabled, 2 32-bit counters
available to each CPU, irq 18
Running an instance of a test program on each of 2 threads in a
core, both threads can use their 2 counters to count 2 events:
taskset 4 perf stat -e instructions:u,branches:u ./test_prog & taskset 8
perf stat -e instructions:u,branches:u ./test_prog
Performance counter stats for './test_prog':
30002 instructions:u
10000 branches:u
0.005164264 seconds time elapsed
Performance counter stats for './test_prog':
30002 instructions:u
10000 branches:u
0.006139975 seconds time elapsed
In an SoC with 2 counters / core (which can be forced by setting
cpu_has_mipsmt_pertccounters = 0), counters are reported as:
Performance counters: mips/interAptiv PMU enabled, 2 32-bit counters
available to each core, irq 18
Running an instance of a test program on each of 2 threads in a
core, now only one thread manages to secure the performance counters to
count 2 events. The other thread does not get any counters.
taskset 4 perf stat -e instructions:u,branches:u ./test_prog & taskset 8
perf stat -e instructions:u,branches:u ./test_prog
Performance counter stats for './test_prog':
<not counted> instructions:u
<not counted> branches:u
0.005179533 seconds time elapsed
Performance counter stats for './test_prog':
30002 instructions:u
10000 branches:u
0.005179467 seconds time elapsed
Signed-off-by: Matt Redfearn <[email protected]>
---
Changes in v3:
- rebase on new feature detection
Changes in v2:
- Fix !#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS build
- re-use cpuc variable in mipsxx_pmu_alloc_counter,
mipsxx_pmu_free_counter rather than having sibling_ version.
arch/mips/kernel/perf_event_mipsxx.c | 130 +++++++++++++++++++++++------------
1 file changed, 85 insertions(+), 45 deletions(-)
diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index fe50986e83c6..a07777aa1b79 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -129,6 +129,8 @@ static struct mips_pmu mipspmu;
#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
+static DEFINE_SPINLOCK(core_counters_lock);
+
static DEFINE_RWLOCK(pmuint_rwlock);
#if defined(CONFIG_CPU_BMIPS5000)
@@ -139,20 +141,6 @@ static DEFINE_RWLOCK(pmuint_rwlock);
0 : cpu_vpe_id(&current_cpu_data))
#endif
-/* Copied from op_model_mipsxx.c */
-static unsigned int vpe_shift(void)
-{
- if (num_possible_cpus() > 1)
- return 1;
-
- return 0;
-}
-
-static unsigned int counters_total_to_per_cpu(unsigned int counters)
-{
- return counters >> vpe_shift();
-}
-
#else /* !CONFIG_MIPS_PERF_SHARED_TC_COUNTERS */
#define vpe_id() 0
@@ -163,17 +151,8 @@ static void pause_local_counters(void);
static irqreturn_t mipsxx_pmu_handle_irq(int, void *);
static int mipsxx_pmu_handle_shared_irq(void);
-static unsigned int mipsxx_pmu_swizzle_perf_idx(unsigned int idx)
-{
- if (vpe_id() == 1)
- idx = (idx + 2) & 3;
- return idx;
-}
-
static u64 mipsxx_pmu_read_counter(unsigned int idx)
{
- idx = mipsxx_pmu_swizzle_perf_idx(idx);
-
switch (idx) {
case 0:
/*
@@ -195,8 +174,6 @@ static u64 mipsxx_pmu_read_counter(unsigned int idx)
static u64 mipsxx_pmu_read_counter_64(unsigned int idx)
{
- idx = mipsxx_pmu_swizzle_perf_idx(idx);
-
switch (idx) {
case 0:
return read_c0_perfcntr0_64();
@@ -214,8 +191,6 @@ static u64 mipsxx_pmu_read_counter_64(unsigned int idx)
static void mipsxx_pmu_write_counter(unsigned int idx, u64 val)
{
- idx = mipsxx_pmu_swizzle_perf_idx(idx);
-
switch (idx) {
case 0:
write_c0_perfcntr0(val);
@@ -234,8 +209,6 @@ static void mipsxx_pmu_write_counter(unsigned int idx, u64 val)
static void mipsxx_pmu_write_counter_64(unsigned int idx, u64 val)
{
- idx = mipsxx_pmu_swizzle_perf_idx(idx);
-
switch (idx) {
case 0:
write_c0_perfcntr0_64(val);
@@ -254,8 +227,6 @@ static void mipsxx_pmu_write_counter_64(unsigned int idx, u64 val)
static unsigned int mipsxx_pmu_read_control(unsigned int idx)
{
- idx = mipsxx_pmu_swizzle_perf_idx(idx);
-
switch (idx) {
case 0:
return read_c0_perfctrl0();
@@ -273,8 +244,6 @@ static unsigned int mipsxx_pmu_read_control(unsigned int idx)
static void mipsxx_pmu_write_control(unsigned int idx, unsigned int val)
{
- idx = mipsxx_pmu_swizzle_perf_idx(idx);
-
switch (idx) {
case 0:
write_c0_perfctrl0(val);
@@ -294,7 +263,7 @@ static void mipsxx_pmu_write_control(unsigned int idx, unsigned int val)
static int mipsxx_pmu_alloc_counter(struct cpu_hw_events *cpuc,
struct hw_perf_event *hwc)
{
- int i;
+ int i, cpu = smp_processor_id();
/*
* We only need to care the counter mask. The range has been
@@ -313,14 +282,85 @@ static int mipsxx_pmu_alloc_counter(struct cpu_hw_events *cpuc,
* they can be dynamically swapped, they both feel happy.
* But here we leave this issue alone for now.
*/
- if (test_bit(i, &cntr_mask) &&
- !test_and_set_bit(i, cpuc->used_mask))
+ if (!test_bit(i, &cntr_mask))
+ continue;
+
+#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
+ /*
+ * When counters are per-core, check for use and allocate
+ * them in all sibling CPUs.
+ */
+ if (!cpu_has_mipsmt_pertccounters) {
+ int sibling_cpu, allocated = 0;
+ unsigned long flags;
+
+ spin_lock_irqsave(&core_counters_lock, flags);
+
+ for_each_cpu(sibling_cpu, &cpu_sibling_map[cpu]) {
+ cpuc = per_cpu_ptr(&cpu_hw_events, sibling_cpu);
+
+ if (test_bit(i, cpuc->used_mask)) {
+ pr_debug("CPU%d already using core counter %d\n",
+ sibling_cpu, i);
+ goto next_counter;
+ }
+ }
+
+ /* Counter i is not used by any siblings - use it */
+ allocated = 1;
+ for_each_cpu(sibling_cpu, &cpu_sibling_map[cpu]) {
+ cpuc = per_cpu_ptr(&cpu_hw_events, sibling_cpu);
+
+ set_bit(i, cpuc->used_mask);
+ /* sibling is using the counter */
+ pr_debug("CPU%d now using core counter %d\n",
+ sibling_cpu, i);
+ }
+next_counter:
+ spin_unlock_irqrestore(&core_counters_lock, flags);
+ if (allocated)
+ return i;
+ }
+ else
+#endif
+ if (!test_and_set_bit(i, cpuc->used_mask)) {
+ pr_debug("CPU%d now using counter %d\n", cpu, i);
return i;
+ }
}
return -EAGAIN;
}
+static void mipsxx_pmu_free_counter(struct cpu_hw_events *cpuc,
+ struct hw_perf_event *hwc)
+{
+ int cpu = smp_processor_id();
+
+#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
+ /* When counters are per-core, free them in all sibling CPUs */
+ if (!cpu_has_mipsmt_pertccounters) {
+ unsigned long flags;
+ int sibling_cpu;
+
+ spin_lock_irqsave(&core_counters_lock, flags);
+
+ for_each_cpu(sibling_cpu, &cpu_sibling_map[cpu]) {
+ cpuc = per_cpu_ptr(&cpu_hw_events, sibling_cpu);
+
+ clear_bit(hwc->idx, cpuc->used_mask);
+ pr_debug("CPU%d released core counter %d\n",
+ sibling_cpu, hwc->idx);
+ }
+
+ spin_unlock_irqrestore(&core_counters_lock, flags);
+ return;
+ }
+#endif
+ pr_debug("CPU%d released counter %d\n", cpu, hwc->idx);
+ clear_bit(hwc->idx, cpuc->used_mask);
+}
+
static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
{
struct perf_event *event = container_of(evt, struct perf_event, hw);
@@ -517,7 +557,7 @@ static void mipspmu_del(struct perf_event *event, int flags)
mipspmu_stop(event, PERF_EF_UPDATE);
cpuc->events[idx] = NULL;
- clear_bit(idx, cpuc->used_mask);
+ mipsxx_pmu_free_counter(cpuc, hwc);
perf_event_update_userpage(event);
}
@@ -1712,11 +1752,6 @@ init_hw_perf_events(void)
return -ENODEV;
}
-#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
- if (!cpu_has_mipsmt_pertccounters)
- counters = counters_total_to_per_cpu(counters);
-#endif
-
if (get_c0_perfcount_int)
irq = get_c0_perfcount_int();
else if (cp0_perfcount_irq >= 0)
@@ -1838,9 +1873,14 @@ init_hw_perf_events(void)
on_each_cpu(reset_counters, (void *)(long)counters, 1);
- pr_cont("%s PMU enabled, %d %d-bit counters available to each "
- "CPU, irq %d%s\n", mipspmu.name, counters, counter_bits, irq,
- irq < 0 ? " (share with timer interrupt)" : "");
+ pr_cont("%s PMU enabled, %d %d-bit counters available to each %s, irq %d%s\n",
+ mipspmu.name, counters, counter_bits,
+#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
+ cpu_has_mipsmt_pertccounters ? "CPU" : "core",
+#else
+ "CPU",
+#endif
+ irq, irq < 0 ? " (share with timer interrupt)" : "");
perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
--
2.7.4
When perf is used in system mode, i.e. specifying a set of CPUs to
count (perf -a -C cpu), event->cpu is set to the CPU number on which
events should be counted. The current BMIPS5000 variation of
mipsxx_pmu_enable_event only ever sets the counter to count on the
current CPU, so system mode does not work.
Fix this by removing this BMIPS5000 specific path and integrating it
with the generic one. Since BMIPS5000 uses specific extensions to the
perf control register, different fields must be set up to count the
relevant CPU.
Signed-off-by: Matt Redfearn <[email protected]>
---
Changes in v3: None
Changes in v2:
New patch to fix BMIPS5000 system mode perf.
Florian, I don't have access to a BMIPS5000 board, but from code
inspection alone I suspect this patch is necessary for system mode to
work. If someone could test it, that would be appreciated.
---
arch/mips/include/asm/mipsregs.h | 1 +
arch/mips/kernel/perf_event_mipsxx.c | 17 ++++++-----------
2 files changed, 7 insertions(+), 11 deletions(-)
diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index a4baaaa02bc8..3e1fbb7aaa2a 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -735,6 +735,7 @@
#define MIPS_PERFCTRL_MT_EN_TC (_ULCAST_(2) << 20)
/* PerfCnt control register MT extensions used by BMIPS5000 */
+#define BRCM_PERFCTRL_VPEID(v) (_ULCAST_(1) << (12 + v))
#define BRCM_PERFCTRL_TC (_ULCAST_(1) << 30)
/* PerfCnt control register MT extensions used by Netlogic XLR */
diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index 5b8811643e60..77d7167e303b 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -364,16 +364,7 @@ static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
/* Make sure interrupt enabled. */
MIPS_PERFCTRL_IE;
-#ifdef CONFIG_CPU_BMIPS5000
- {
- /* enable the counter for the calling thread */
- unsigned int vpe_id;
-
- vpe_id = smp_processor_id() & MIPS_CPUID_TO_COUNTER_MASK;
- cpuc->saved_ctrl[idx] |= BIT(12 + vpe_id) | BRCM_PERFCTRL_TC;
- }
-#else
-#ifdef CONFIG_MIPS_MT_SMP
+#if defined(CONFIG_MIPS_MT_SMP) && !defined(CONFIG_CPU_BMIPS5000)
if (range > V) {
/* The counter is processor wide. Set it up to count all TCs. */
pr_debug("Enabling perf counter for all TCs\n");
@@ -390,12 +381,16 @@ static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
*/
cpu = (event->cpu >= 0) ? event->cpu : smp_processor_id();
+#if defined(CONFIG_CPU_BMIPS5000)
+ ctrl = BRCM_PERFCTRL_VPEID(cpu & MIPS_CPUID_TO_COUNTER_MASK);
+ ctrl |= BRCM_PERFCTRL_TC;
+#else
ctrl = M_PERFCTL_VPEID(cpu_vpe_id(&cpu_data[cpu]));
ctrl |= M_TC_EN_VPE;
+#endif
cpuc->saved_ctrl[idx] |= ctrl;
pr_debug("Enabling perf counter for CPU%d\n", cpu);
}
-#endif /* CONFIG_CPU_BMIPS5000 */
/*
* We do not actually let the counter run. Leave it until start().
*/
--
2.7.4
The vpe_id() macro is now used only in mipsxx_pmu_enable_event when
CONFIG_CPU_BMIPS5000 is defined. Fold the one definition of the macro
that is still used into its usage and remove the now-unused
definitions.
Since we know that cpu_has_mipsmt_pertccounters == 0 on BMIPS5000,
remove the test on it and just set the counter to count the relevant
VPE.
Signed-off-by: Matt Redfearn <[email protected]>
---
Changes in v3: None
Changes in v2:
Since BMIPS5000 does not implement per TC counters, we can remove the
check on cpu_has_mipsmt_pertccounters.
arch/mips/kernel/perf_event_mipsxx.c | 18 ++++--------------
1 file changed, 4 insertions(+), 14 deletions(-)
diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index a07777aa1b79..5b8811643e60 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -132,18 +132,6 @@ static struct mips_pmu mipspmu;
static DEFINE_SPINLOCK(core_counters_lock);
static DEFINE_RWLOCK(pmuint_rwlock);
-
-#if defined(CONFIG_CPU_BMIPS5000)
-#define vpe_id() (cpu_has_mipsmt_pertccounters ? \
- 0 : (smp_processor_id() & MIPS_CPUID_TO_COUNTER_MASK))
-#else
-#define vpe_id() (cpu_has_mipsmt_pertccounters ? \
- 0 : cpu_vpe_id(&current_cpu_data))
-#endif
-
-#else /* !CONFIG_MIPS_PERF_SHARED_TC_COUNTERS */
-#define vpe_id() 0
-
#endif /* CONFIG_MIPS_PERF_SHARED_TC_COUNTERS */
static void resume_local_counters(void);
@@ -379,8 +367,10 @@ static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
#ifdef CONFIG_CPU_BMIPS5000
{
/* enable the counter for the calling thread */
- cpuc->saved_ctrl[idx] |=
- (1 << (12 + vpe_id())) | BRCM_PERFCTRL_TC;
+ unsigned int vpe_id;
+
+ vpe_id = smp_processor_id() & MIPS_CPUID_TO_COUNTER_MASK;
+ cpuc->saved_ctrl[idx] |= BIT(12 + vpe_id) | BRCM_PERFCTRL_TC;
}
#else
#ifdef CONFIG_MIPS_MT_SMP
--
2.7.4
The presence of per-TC performance counters is now detected by
cpu-probe.c and indicated by MIPS_CPU_MT_PER_TC_PERF_COUNTERS in
cpu_data. Switch detection of the feature to use this new flag rather
than blindly testing the implementation-specific Config7 register with
a magic number.
Signed-off-by: Matt Redfearn <[email protected]>
---
Changes in v3:
Use flag in cpu_data set by cpu_probe.c to indicate feature presence.
Changes in v2: None
arch/mips/include/asm/cpu-features.h | 7 +++++++
arch/mips/kernel/perf_event_mipsxx.c | 3 ---
arch/mips/oprofile/op_model_mipsxx.c | 2 --
3 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/arch/mips/include/asm/cpu-features.h b/arch/mips/include/asm/cpu-features.h
index 721b698bfe3c..69755d900c69 100644
--- a/arch/mips/include/asm/cpu-features.h
+++ b/arch/mips/include/asm/cpu-features.h
@@ -534,6 +534,13 @@
# define cpu_has_shared_ftlb_entries 0
#endif
+#ifdef CONFIG_MIPS_MT_SMP
+# define cpu_has_mipsmt_pertccounters \
+ (cpu_data[0].options & MIPS_CPU_MT_PER_TC_PERF_COUNTERS)
+#else
+# define cpu_has_mipsmt_pertccounters 0
+#endif /* CONFIG_MIPS_MT_SMP */
+
/*
* Guest capabilities
*/
diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index 6668f67a61c3..0595a974bc81 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -129,8 +129,6 @@ static struct mips_pmu mipspmu;
#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
-static int cpu_has_mipsmt_pertccounters;
-
static DEFINE_RWLOCK(pmuint_rwlock);
#if defined(CONFIG_CPU_BMIPS5000)
@@ -1723,7 +1721,6 @@ init_hw_perf_events(void)
}
#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS
- cpu_has_mipsmt_pertccounters = read_c0_config7() & (1<<19);
if (!cpu_has_mipsmt_pertccounters)
counters = counters_total_to_per_cpu(counters);
#endif
diff --git a/arch/mips/oprofile/op_model_mipsxx.c b/arch/mips/oprofile/op_model_mipsxx.c
index c3e4c18ef8d4..7c04b17f4a48 100644
--- a/arch/mips/oprofile/op_model_mipsxx.c
+++ b/arch/mips/oprofile/op_model_mipsxx.c
@@ -36,7 +36,6 @@ static int perfcount_irq;
#endif
#ifdef CONFIG_MIPS_MT_SMP
-static int cpu_has_mipsmt_pertccounters;
#define WHAT (MIPS_PERFCTRL_MT_EN_VPE | \
M_PERFCTL_VPEID(cpu_vpe_id(&current_cpu_data)))
#define vpe_id() (cpu_has_mipsmt_pertccounters ? \
@@ -326,7 +325,6 @@ static int __init mipsxx_init(void)
}
#ifdef CONFIG_MIPS_MT_SMP
- cpu_has_mipsmt_pertccounters = read_c0_config7() & (1<<19);
if (!cpu_has_mipsmt_pertccounters)
counters = counters_total_to_per_cpu(counters);
#endif
--
2.7.4
On 04/20/2018 03:23 AM, Matt Redfearn wrote:
> This series addresses a few issues with how the MIPS performance
> counters code supports the hardware multithreading MT ASE.
>
> Firstly, implementations of the MT ASE may implement performance
> counters
> per core or per thread (TC). MIPS Technologies implementations signal this
> via a bit in the implementation-specific CONFIG7 register. Since this
> register is implementation specific, checking it should be guarded by a
> PRID check. This also replaces a bit defined by a magic number.
>
> Secondly, the code currently uses vpe_id(), defined as
> smp_processor_id() divided by 2, to share core performance counters
> between VPEs. This relies on a couple of assumptions of the hardware
> implementation to function correctly (always 2 VPEs per core, and the
> hardware reading only the least significant bit).
>
> Finally, the method of sharing core performance counters between VPEs is
> suboptimal since it does not allow one process running on a VPE to use
> all of the performance counters available to it, because the kernel will
> reserve half of the counters for the other VPE even if it may never use
> them. This reservation at counter probe is replaced with an allocation
> on use strategy.
>
> Tested on a MIPS Creator CI40 (2C2T MIPS InterAptiv with per-TC
> counters, though for the purposes of testing the per-TC availability was
> hardcoded to allow testing both paths).
>
> Series applies to v4.16
Sorry it took so long to get that tested.
Sounds like you need to build test this on a BMIPS5000 configuration
(bmips_stb_defconfig should provide that):
In file included from ./arch/mips/include/asm/mach-generic/spaces.h:15:0,
from ./arch/mips/include/asm/mach-bmips/spaces.h:16,
from ./arch/mips/include/asm/addrspace.h:13,
from ./arch/mips/include/asm/barrier.h:11,
from ./include/linux/compiler.h:245,
from ./include/linux/kernel.h:10,
from ./include/linux/cpumask.h:10,
from arch/mips/kernel/perf_event_mipsxx.c:18:
arch/mips/kernel/perf_event_mipsxx.c: In function 'mipsxx_pmu_enable_event':
./arch/mips/include/asm/mipsregs.h:738:52: error: suggest parentheses
around '+' in operand of '&' [-Werror=parentheses]
#define BRCM_PERFCTRL_VPEID(v) (_ULCAST_(1) << (12 + v))
arch/mips/kernel/perf_event_mipsxx.c:385:10: note: in expansion of macro
'BRCM_PERFCTRL_VPEID'
ctrl = BRCM_PERFCTRL_VPEID(cpu & MIPS_CPUID_TO_COUNTER_MASK);
^~~~~~~~~~~~~~~~~~~
CC drivers/of/fdt_addres
after fixing that, I tried the following to see whether this would be a
good test case to exercise against:
perf record -a -C 0 taskset -c 1 /bin/true
perf record -a -C 1 taskset -c 0 /bin/true
and would not see anything related to /bin/true running in either case,
which seems like it does the right thing?
Tested-by: Florian Fainelli <[email protected]>
BTW, for some reason not specifying -a -C <cpu> does lead to lockups,
consistently and for pretty much any perf command, this could be BMIPS
specific, did not get a chance to cross test on a different machine.
>
>
> Changes in v3:
> New patch to detect feature presence in cpu-probe.c
> Use flag in cpu_data set by cpu_probe.c to indicate feature presence.
> - rebase on new feature detection
>
> Changes in v2:
> Fix mipsxx_pmu_enable_event for !#ifdef CONFIG_MIPS_MT_SMP
> - Fix !#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS build
> - re-use cpuc variable in mipsxx_pmu_alloc_counter,
> mipsxx_pmu_free_counter rather than having sibling_ version.
> Since BMIPS5000 does not implement per TC counters, we can remove the
> check on cpu_has_mipsmt_pertccounters.
> New patch to fix BMIPS5000 system mode perf.
>
> Matt Redfearn (7):
> MIPS: Probe for MIPS MT perf counters per TC
> MIPS: perf: More robustly probe for the presence of per-tc counters
> MIPS: perf: Use correct VPE ID when setting up VPE tracing
> MIPS: perf: Fix perf with MT counting other threads
> MIPS: perf: Allocate per-core counters on demand
> MIPS: perf: Fold vpe_id() macro into it's one last usage
> MIPS: perf: Fix BMIPS5000 system mode counting
>
> arch/mips/include/asm/cpu-features.h | 7 ++
> arch/mips/include/asm/cpu.h | 2 +
> arch/mips/include/asm/mipsregs.h | 6 +
> arch/mips/kernel/cpu-probe.c | 12 ++
> arch/mips/kernel/perf_event_mipsxx.c | 232 +++++++++++++++++++----------------
> arch/mips/oprofile/op_model_mipsxx.c | 2 -
> 6 files changed, 150 insertions(+), 111 deletions(-)
>
--
Florian
On 20/04/18 23:51, Florian Fainelli wrote:
> On 04/20/2018 03:23 AM, Matt Redfearn wrote:
>> This series addresses a few issues with how the MIPS performance
>> counters code supports the hardware multithreading MT ASE.
>>
>> Firstly, implementations of the MT ASE may implement performance
>> counters
>> per core or per thread (TC). MIPS Technologies implementations signal this
>> via a bit in the implementation-specific CONFIG7 register. Since this
>> register is implementation specific, checking it should be guarded by a
>> PRID check. This also replaces a bit defined by a magic number.
>>
>> Secondly, the code currently uses vpe_id(), defined as
>> smp_processor_id() divided by 2, to share core performance counters
>> between VPEs. This relies on a couple of assumptions of the hardware
>> implementation to function correctly (always 2 VPEs per core, and the
>> hardware reading only the least significant bit).
>>
>> Finally, the method of sharing core performance counters between VPEs is
>> suboptimal since it does not allow one process running on a VPE to use
>> all of the performance counters available to it, because the kernel will
>> reserve half of the counters for the other VPE even if it may never use
>> them. This reservation at counter probe is replaced with an allocation
>> on use strategy.
>>
>> Tested on a MIPS Creator CI40 (2C2T MIPS InterAptiv with per-TC
>> counters, though for the purposes of testing the per-TC availability was
>> hardcoded to allow testing both paths).
>>
>> Series applies to v4.16
>
> Sorry it took so long to get that tested.
Hi Florian,
Thanks for testing!
>
> Sounds like you need to build test this on a BMIPS5000 configuration
> (bmips_stb_defconfig should provide that):
>
> In file included from ./arch/mips/include/asm/mach-generic/spaces.h:15:0,
> from ./arch/mips/include/asm/mach-bmips/spaces.h:16,
> from ./arch/mips/include/asm/addrspace.h:13,
> from ./arch/mips/include/asm/barrier.h:11,
> from ./include/linux/compiler.h:245,
> from ./include/linux/kernel.h:10,
> from ./include/linux/cpumask.h:10,
> from arch/mips/kernel/perf_event_mipsxx.c:18:
> arch/mips/kernel/perf_event_mipsxx.c: In function 'mipsxx_pmu_enable_event':
> ./arch/mips/include/asm/mipsregs.h:738:52: error: suggest parentheses
> around '+' in operand of '&' [-Werror=parentheses]
> #define BRCM_PERFCTRL_VPEID(v) (_ULCAST_(1) << (12 + v))
>
> arch/mips/kernel/perf_event_mipsxx.c:385:10: note: in expansion of macro
> 'BRCM_PERFCTRL_VPEID'
> ctrl = BRCM_PERFCTRL_VPEID(cpu & MIPS_CPUID_TO_COUNTER_MASK);
> ^~~~~~~~~~~~~~~~~~~
> CC drivers/of/fdt_addres
Good spot - I've updated the patch to
+#define BRCM_PERFCTRL_VPEID(v) (_ULCAST_(1) << (12 + (v)))
to fix that.
>
> after fixing that, I tried the following to see whether this would be a
> good test case to exercise against:
>
> perf record -a -C 0 taskset -c 1 /bin/true
> perf record -a -C 1 taskset -c 0 /bin/true
>
> and would not see anything related to /bin/true running in either case,
> which seems like it does the right thing?
I've generally been testing using this code:
perf_test.S:
#include <asm/unistd.h>
#define ITERATIONS 10000
.text
.global __start
.type __start, @function;
__start:
.set noreorder
li $2, ITERATIONS
1:
addiu $2,$2,-1
bnez $2, 1b
nop
li $2, __NR_exit
syscall
Makefile:
$(CC) $(ISA_FLAG) $(ABI_FLAG) $(ENDIAN_FLAG) -static -nostartfiles -O2 \
	-o perf_test perf_test.S
Then running perf which should report the right counts:
taskset 1 perf stat -e instructions:u,branches:u ./perf_test
Performance counter stats for './perf_test':
30002 instructions:u
10000 branches:u
0.005179467 seconds time elapsed
System mode should also work:
# perf stat -e instructions:u,branches:u -a -C 2
^C
Performance counter stats for 'system wide':
1416 instructions:u
198 branches:u
4.454874812 seconds time elapsed
>
> Tested-by: Florian Fainelli <[email protected]>
Thanks!
>
> BTW, for some reason not specifying -a -C <cpu> does lead to lockups,
> consistently and for pretty much any perf command, this could be BMIPS
> specific, did not get a chance to cross test on a different machine.
Interesting.... FWIW I don't get lockups on ci40 (MIPS InterAptiv). Is
this a regression with this series applied or an existing problem?
Thanks,
Matt
>
>>
>>
>> Changes in v3:
>> New patch to detect feature presence in cpu-probe.c
>> Use flag in cpu_data set by cpu_probe.c to indicate feature presence.
>> - rebase on new feature detection
>>
>> Changes in v2:
>> Fix mipsxx_pmu_enable_event for !#ifdef CONFIG_MIPS_MT_SMP
>> - Fix !#ifdef CONFIG_MIPS_PERF_SHARED_TC_COUNTERS build
>> - re-use cpuc variable in mipsxx_pmu_alloc_counter,
>> mipsxx_pmu_free_counter rather than having sibling_ version.
>> Since BMIPS5000 does not implement per TC counters, we can remove the
>> check on cpu_has_mipsmt_pertccounters.
>> New patch to fix BMIPS5000 system mode perf.
>>
>> Matt Redfearn (7):
>> MIPS: Probe for MIPS MT perf counters per TC
>> MIPS: perf: More robustly probe for the presence of per-tc counters
>> MIPS: perf: Use correct VPE ID when setting up VPE tracing
>> MIPS: perf: Fix perf with MT counting other threads
>> MIPS: perf: Allocate per-core counters on demand
>> MIPS: perf: Fold vpe_id() macro into it's one last usage
>> MIPS: perf: Fix BMIPS5000 system mode counting
>>
>> arch/mips/include/asm/cpu-features.h | 7 ++
>> arch/mips/include/asm/cpu.h | 2 +
>> arch/mips/include/asm/mipsregs.h | 6 +
>> arch/mips/kernel/cpu-probe.c | 12 ++
>> arch/mips/kernel/perf_event_mipsxx.c | 232 +++++++++++++++++++----------------
>> arch/mips/oprofile/op_model_mipsxx.c | 2 -
>> 6 files changed, 150 insertions(+), 111 deletions(-)
>>
>
>
When perf is used in system mode, i.e. specifying a set of CPUs to
count (perf -a -C cpu), event->cpu is set to the CPU number on which
events should be counted. The current BMIPS5000 variation of
mipsxx_pmu_enable_event only ever sets the counter to count the current
CPU, so system mode does not work.
Fix this by removing this BMIPS5000 specific path and integrating it
with the generic one. Since BMIPS5000 uses specific extensions to the
perf control register, different fields must be set up to count the
relevant CPU.
Signed-off-by: Matt Redfearn <[email protected]>
Tested-by: Florian Fainelli <[email protected]>
---
Changes in v4:
Fix compiler error from BRCM_PERFCTRL_VPEID flagged by Florian.
Changes in v2:
New patch to fix BMIPS5000 system mode perf.
arch/mips/include/asm/mipsregs.h | 1 +
arch/mips/kernel/perf_event_mipsxx.c | 17 ++++++-----------
2 files changed, 7 insertions(+), 11 deletions(-)
diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index a4baaaa02bc..6b0b06d2683 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -735,6 +735,7 @@
#define MIPS_PERFCTRL_MT_EN_TC (_ULCAST_(2) << 20)
/* PerfCnt control register MT extensions used by BMIPS5000 */
+#define BRCM_PERFCTRL_VPEID(v) (_ULCAST_(1) << (12 + (v)))
#define BRCM_PERFCTRL_TC (_ULCAST_(1) << 30)
/* PerfCnt control register MT extensions used by Netlogic XLR */
diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
index 5b8811643e6..77d7167e303 100644
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -364,16 +364,7 @@ static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
/* Make sure interrupt enabled. */
MIPS_PERFCTRL_IE;
-#ifdef CONFIG_CPU_BMIPS5000
- {
- /* enable the counter for the calling thread */
- unsigned int vpe_id;
-
- vpe_id = smp_processor_id() & MIPS_CPUID_TO_COUNTER_MASK;
- cpuc->saved_ctrl[idx] |= BIT(12 + vpe_id) | BRCM_PERFCTRL_TC;
- }
-#else
-#ifdef CONFIG_MIPS_MT_SMP
+#if defined(CONFIG_MIPS_MT_SMP) && !defined(CONFIG_CPU_BMIPS5000)
if (range > V) {
/* The counter is processor wide. Set it up to count all TCs. */
pr_debug("Enabling perf counter for all TCs\n");
@@ -390,12 +381,16 @@ static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
*/
cpu = (event->cpu >= 0) ? event->cpu : smp_processor_id();
+#if defined(CONFIG_CPU_BMIPS5000)
+ ctrl = BRCM_PERFCTRL_VPEID(cpu & MIPS_CPUID_TO_COUNTER_MASK);
+ ctrl |= BRCM_PERFCTRL_TC;
+#else
ctrl = M_PERFCTL_VPEID(cpu_vpe_id(&cpu_data[cpu]));
ctrl |= M_TC_EN_VPE;
+#endif
cpuc->saved_ctrl[idx] |= ctrl;
pr_debug("Enabling perf counter for CPU%d\n", cpu);
}
-#endif /* CONFIG_CPU_BMIPS5000 */
/*
* We do not actually let the counter run. Leave it until start().
*/
--
2.7.4
On Fri, Apr 20, 2018 at 11:23:06AM +0100, Matt Redfearn wrote:
> diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
> index 7e2b7d38a774..fe50986e83c6 100644
> --- a/arch/mips/kernel/perf_event_mipsxx.c
> +++ b/arch/mips/kernel/perf_event_mipsxx.c
> @@ -323,7 +323,11 @@ static int mipsxx_pmu_alloc_counter(struct cpu_hw_events *cpuc,
>
> static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
> {
> + struct perf_event *event = container_of(evt, struct perf_event, hw);
> struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
> +#ifdef CONFIG_MIPS_MT_SMP
> + unsigned int range = evt->event_base >> 24;
> +#endif /* CONFIG_MIPS_MT_SMP */
>
> WARN_ON(idx < 0 || idx >= mipspmu.num_counters);
>
> @@ -331,11 +335,37 @@ static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
> (evt->config_base & M_PERFCTL_CONFIG_MASK) |
> /* Make sure interrupt enabled. */
> MIPS_PERFCTRL_IE;
> - if (IS_ENABLED(CONFIG_CPU_BMIPS5000))
> +
> +#ifdef CONFIG_CPU_BMIPS5000
> + {
> /* enable the counter for the calling thread */
> cpuc->saved_ctrl[idx] |=
> (1 << (12 + vpe_id())) | BRCM_PERFCTRL_TC;
> + }
> +#else
> +#ifdef CONFIG_MIPS_MT_SMP
> + if (range > V) {
> + /* The counter is processor wide. Set it up to count all TCs. */
> + pr_debug("Enabling perf counter for all TCs\n");
> + cpuc->saved_ctrl[idx] |= M_TC_EN_ALL;
> + } else
> +#endif /* CONFIG_MIPS_MT_SMP */
> + {
> + unsigned int cpu, ctrl;
>
> + /*
> + * Set up the counter for a particular CPU when event->cpu is
> + * a valid CPU number. Otherwise set up the counter for the CPU
> + * scheduling this thread.
> + */
> + cpu = (event->cpu >= 0) ? event->cpu : smp_processor_id();
> +
> + ctrl = M_PERFCTL_VPEID(cpu_vpe_id(&cpu_data[cpu]));
> + ctrl |= M_TC_EN_VPE;
> + cpuc->saved_ctrl[idx] |= ctrl;
> + pr_debug("Enabling perf counter for CPU%d\n", cpu);
> + }
> +#endif /* CONFIG_CPU_BMIPS5000 */
I'm not a huge fan of the ifdefery tbh, I don't think it makes it very
easy to read having a combination of ifs and #ifdefs. I reckon
IS_ENABLED() would be better, perhaps with having the BMIPS5000 case
return to avoid too much nesting.
Otherwise the patch looks okay to me.
Thanks
James
On Fri, Apr 20, 2018 at 11:23:07AM +0100, Matt Redfearn wrote:
> Previously, when performance counters were per-core rather than
> per-thread, the number available was divided by 2 on detection, and the
> counters used by each thread in a core were "swizzled" to ensure
> separation. However, this solution is suboptimal since it relies on a
> couple of assumptions:
> a) Always having 2 VPEs / core (number of counters was divided by 2)
> b) Always having a number of counters implemented in the core that is
> divisible by 2. For instance if an SoC implementation had a single
> counter and 2 VPEs per core, then this logic would fail and no
> performance counters would be available.
> The mechanism also does not allow for one VPE in a core using more than
> its allocation of the per-core counters to count multiple events even
> though other VPEs may not be using them.
>
> Fix this situation by instead allocating (and releasing) per-core
> performance counters when they are requested. This approach removes the
> above assumptions and fixes the shortcomings.
>
> In order to do this:
> Add additional logic to mipsxx_pmu_alloc_counter() to detect if a
> sibling is using a per-core counter, and to allocate a per-core counter
> in all sibling CPUs.
> Similarly, add a mipsxx_pmu_free_counter() function to release a
> per-core counter in all sibling CPUs when it is finished with.
> A new spinlock, core_counters_lock, is introduced to ensure exclusivity
> when allocating and releasing per-core counters.
> Since counters are now allocated per-core on demand, rather than being
> reserved per-thread at boot, all of the "swizzling" of counters is
> removed.
>
> The upshot is that in an SoC with 2 counters / thread, counters are
> reported as:
> Performance counters: mips/interAptiv PMU enabled, 2 32-bit counters
> available to each CPU, irq 18
>
> Running an instance of a test program on each of 2 threads in a
> core, both threads can use their 2 counters to count 2 events:
>
> taskset 4 perf stat -e instructions:u,branches:u ./test_prog & taskset 8
> perf stat -e instructions:u,branches:u ./test_prog
>
> Performance counter stats for './test_prog':
>
> 30002 instructions:u
> 10000 branches:u
>
> 0.005164264 seconds time elapsed
> Performance counter stats for './test_prog':
>
> 30002 instructions:u
> 10000 branches:u
>
> 0.006139975 seconds time elapsed
>
> In an SoC with 2 counters / core (which can be forced by setting
> cpu_has_mipsmt_pertccounters = 0), counters are reported as:
> Performance counters: mips/interAptiv PMU enabled, 2 32-bit counters
> available to each core, irq 18
>
> Running an instance of a test program on each of 2 threads in a
> core, now only one thread manages to secure the performance counters to
> count 2 events. The other thread does not get any counters.
>
> taskset 4 perf stat -e instructions:u,branches:u ./test_prog & taskset 8
> perf stat -e instructions:u,branches:u ./test_prog
>
> Performance counter stats for './test_prog':
>
> <not counted> instructions:u
> <not counted> branches:u
>
> 0.005179533 seconds time elapsed
>
> Performance counter stats for './test_prog':
>
> 30002 instructions:u
> 10000 branches:u
>
> 0.005179467 seconds time elapsed
>
> Signed-off-by: Matt Redfearn <[email protected]>
While this sounds like an improvement in practice, being able to use
more counters on single threaded stuff than otherwise, I'm a little
concerned what would happen if a task was migrated to a different CPU
and the perf counters couldn't be obtained on the new CPU due to
counters already being in use. Would the values be incorrectly small?
Cheers
James
Hi James,
On 16/05/18 18:59, James Hogan wrote:
> On Fri, Apr 20, 2018 at 11:23:06AM +0100, Matt Redfearn wrote:
>> diff --git a/arch/mips/kernel/perf_event_mipsxx.c b/arch/mips/kernel/perf_event_mipsxx.c
>> index 7e2b7d38a774..fe50986e83c6 100644
>> --- a/arch/mips/kernel/perf_event_mipsxx.c
>> +++ b/arch/mips/kernel/perf_event_mipsxx.c
>> @@ -323,7 +323,11 @@ static int mipsxx_pmu_alloc_counter(struct cpu_hw_events *cpuc,
>>
>> static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
>> {
>> + struct perf_event *event = container_of(evt, struct perf_event, hw);
>> struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +#ifdef CONFIG_MIPS_MT_SMP
>> + unsigned int range = evt->event_base >> 24;
>> +#endif /* CONFIG_MIPS_MT_SMP */
>>
>> WARN_ON(idx < 0 || idx >= mipspmu.num_counters);
>>
>> @@ -331,11 +335,37 @@ static void mipsxx_pmu_enable_event(struct hw_perf_event *evt, int idx)
>> (evt->config_base & M_PERFCTL_CONFIG_MASK) |
>> /* Make sure interrupt enabled. */
>> MIPS_PERFCTRL_IE;
>> - if (IS_ENABLED(CONFIG_CPU_BMIPS5000))
>> +
>> +#ifdef CONFIG_CPU_BMIPS5000
>> + {
>> /* enable the counter for the calling thread */
>> cpuc->saved_ctrl[idx] |=
>> (1 << (12 + vpe_id())) | BRCM_PERFCTRL_TC;
>> + }
>> +#else
>> +#ifdef CONFIG_MIPS_MT_SMP
>> + if (range > V) {
>> + /* The counter is processor wide. Set it up to count all TCs. */
>> + pr_debug("Enabling perf counter for all TCs\n");
>> + cpuc->saved_ctrl[idx] |= M_TC_EN_ALL;
>> + } else
>> +#endif /* CONFIG_MIPS_MT_SMP */
>> + {
>> + unsigned int cpu, ctrl;
>>
>> + /*
>> + * Set up the counter for a particular CPU when event->cpu is
>> + * a valid CPU number. Otherwise set up the counter for the CPU
>> + * scheduling this thread.
>> + */
>> + cpu = (event->cpu >= 0) ? event->cpu : smp_processor_id();
>> +
>> + ctrl = M_PERFCTL_VPEID(cpu_vpe_id(&cpu_data[cpu]));
>> + ctrl |= M_TC_EN_VPE;
>> + cpuc->saved_ctrl[idx] |= ctrl;
>> + pr_debug("Enabling perf counter for CPU%d\n", cpu);
>> + }
>> +#endif /* CONFIG_CPU_BMIPS5000 */
>
> I'm not a huge fan of the ifdefery tbh, I don't think it makes it very
> easy to read having a combination of ifs and #ifdefs. I reckon
> IS_ENABLED() would be better, perhaps with having the BMIPS5000 case
> return to avoid too much nesting.
OK, I'll try and tidy it up.
Thanks,
Matt
>
> Otherwise the patch looks okay to me.
>
> Thanks
> James
>
Hi James,
On 16/05/18 19:05, James Hogan wrote:
> On Fri, Apr 20, 2018 at 11:23:07AM +0100, Matt Redfearn wrote:
>> Previously, when performance counters were per-core rather than
>> per-thread, the number available was divided by 2 on detection, and the
>> counters used by each thread in a core were "swizzled" to ensure
>> separation. However, this solution is suboptimal since it relies on a
>> couple of assumptions:
>> a) Always having 2 VPEs / core (number of counters was divided by 2)
>> b) Always having a number of counters implemented in the core that is
>> divisible by 2. For instance if an SoC implementation had a single
>> counter and 2 VPEs per core, then this logic would fail and no
>> performance counters would be available.
>> The mechanism also does not allow for one VPE in a core using more than
>> its allocation of the per-core counters to count multiple events even
>> though other VPEs may not be using them.
>>
>> Fix this situation by instead allocating (and releasing) per-core
>> performance counters when they are requested. This approach removes the
>> above assumptions and fixes the shortcomings.
>>
>> In order to do this:
>> Add additional logic to mipsxx_pmu_alloc_counter() to detect if a
>> sibling is using a per-core counter, and to allocate a per-core counter
>> in all sibling CPUs.
>> Similarly, add a mipsxx_pmu_free_counter() function to release a
>> per-core counter in all sibling CPUs when it is finished with.
>> A new spinlock, core_counters_lock, is introduced to ensure exclusivity
>> when allocating and releasing per-core counters.
>> Since counters are now allocated per-core on demand, rather than being
>> reserved per-thread at boot, all of the "swizzling" of counters is
>> removed.
>>
>> The upshot is that in an SoC with 2 counters / thread, counters are
>> reported as:
>> Performance counters: mips/interAptiv PMU enabled, 2 32-bit counters
>> available to each CPU, irq 18
>>
>> Running an instance of a test program on each of 2 threads in a
>> core, both threads can use their 2 counters to count 2 events:
>>
>> taskset 4 perf stat -e instructions:u,branches:u ./test_prog & taskset 8
>> perf stat -e instructions:u,branches:u ./test_prog
>>
>> Performance counter stats for './test_prog':
>>
>> 30002 instructions:u
>> 10000 branches:u
>>
>> 0.005164264 seconds time elapsed
>> Performance counter stats for './test_prog':
>>
>> 30002 instructions:u
>> 10000 branches:u
>>
>> 0.006139975 seconds time elapsed
>>
>> In an SoC with 2 counters / core (which can be forced by setting
>> cpu_has_mipsmt_pertccounters = 0), counters are reported as:
>> Performance counters: mips/interAptiv PMU enabled, 2 32-bit counters
>> available to each core, irq 18
>>
>> Running an instance of a test program on each of 2 threads in a
>> core, now only one thread manages to secure the performance counters to
>> count 2 events. The other thread does not get any counters.
>>
>> taskset 4 perf stat -e instructions:u,branches:u ./test_prog & taskset 8
>> perf stat -e instructions:u,branches:u ./test_prog
>>
>> Performance counter stats for './test_prog':
>>
>> <not counted> instructions:u
>> <not counted> branches:u
>>
>> 0.005179533 seconds time elapsed
>>
>> Performance counter stats for './test_prog':
>>
>> 30002 instructions:u
>> 10000 branches:u
>>
>> 0.005179467 seconds time elapsed
>>
>> Signed-off-by: Matt Redfearn <[email protected]>
>
> While this sounds like an improvement in practice, being able to use
> more counters on single threaded stuff than otherwise, I'm a little
> concerned what would happen if a task was migrated to a different CPU
> and the perf counters couldn't be obtained on the new CPU due to
> counters already being in use. Would the values be incorrectly small?
This change was really forced by the new I7200 development. Current
configurations have 2 counters per core, but each core has 3 VPEs -
which means the current logic cannot correctly assign counters. IoW the
2 assumptions stated in the commit log are no longer true.
Though you are right that if a task migrated to a core on which another
VPE is already using the counters, this change would mean counters
cannot be assigned. In that case we return EAGAIN. I'm not sure if that
error would be handled gracefully by the scheduler and the task
scheduled away again... The code events logic that backs this is tricky
to follow.
Thanks,
Matt
>
> Cheers
> James
>