Changes since v1:
 * Removed "legacy" description of PERF_TYPE_HARDWARE and
   PERF_TYPE_HW_CACHE events
 * Shifted down the remaining capability bits instead of inserting
   /* Unused */ in the gap (last commit)
Applies to v6.5-rc3
---------------------
This came out of the discussion here [1]. It seems like we can get some
extra big.LITTLE functionality working pretty easily. The test issues
mentioned in the linked thread are actually fairly unrelated and I've
fixed them in a separate series on the list.
The first commit adds PERF_PMU_CAP_EXTENDED_HW_TYPE to the Arm PMU. The
remaining ones tidy up a related capability,
PERF_PMU_CAP_HETEROGENEOUS_CPUS, that doesn't do anything any more.
I've added a Fixes tag pointing at the commit where
PERF_PMU_CAP_EXTENDED_HW_TYPE was originally introduced, because the
capability probably should have been added to the Arm PMU at the same
time. The patch doesn't apply cleanly that far back because another
capability was added in the meantime, but the conflict resolution is
trivial.
Thanks
James
[1]: https://lore.kernel.org/linux-perf-users/CAP-5=fVkRc9=ySJ=fG-SQ8oAKmE_1mhHHzSASmGHUsda5Qy92A@mail.gmail.com/T/#t
James Clark (4):
arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability
perf/x86: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
arm_pmu: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
perf: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
arch/x86/events/core.c | 1 -
drivers/perf/arm_pmu.c | 10 ++++++----
include/linux/perf_event.h | 7 +++----
3 files changed, 9 insertions(+), 9 deletions(-)
--
2.34.1
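The "shift down" choice from the changelog above can be illustrated with a
small sketch. The diffstat shows the capability flags being changed in
include/linux/perf_event.h, which is a kernel-internal header rather than a
uapi one, so renumbering should not affect userspace. The flag names and
values below are hypothetical and are not copied from that header:

```c
#include <assert.h>

/*
 * Hypothetical capability flags, for illustration only. These names and
 * values are assumptions, not the real PERF_PMU_CAP_* definitions.
 * A flag that used to own 0x0002 has been removed, and the flags above
 * it were shifted down rather than leaving an unused gap.
 */
#define DEMO_CAP_FIRST   0x0001u
#define DEMO_CAP_SECOND  0x0002u  /* shifted down from 0x0004 */
#define DEMO_CAP_THIRD   0x0004u  /* shifted down from 0x0008 */

#define DEMO_CAP_ALL (DEMO_CAP_FIRST | DEMO_CAP_SECOND | DEMO_CAP_THIRD)

/* Return non-zero when every bit in 'cap' is set in 'capabilities'. */
static int demo_has_cap(unsigned int capabilities, unsigned int cap)
{
	return (capabilities & cap) == cap;
}

/*
 * Return non-zero when the set flags occupy consecutive low bits with
 * no hole, i.e. the mask has the form 2^n - 1.
 */
static int demo_caps_are_dense(unsigned int all_caps)
{
	return all_caps != 0 && (all_caps & (all_caps + 1)) == 0;
}

/* Sanity check of the layout above; aborts if the flags leave a gap. */
static void demo_check_layout(void)
{
	assert(demo_caps_are_dense(DEMO_CAP_ALL));
}
```

Callers only ever test flags by name, as in demo_has_cap(), so the numeric
values can change freely as long as everything is rebuilt together.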
Since commit bd2756811766 ("perf: Rewrite core context handling") the
relationship between perf_event_context and PMUs has changed so that
the error scenario that PERF_PMU_CAP_HETEROGENEOUS_CPUS originally
silenced no longer exists.
Remove the capability to avoid confusion that it actually influences
any perf core behavior. This change should be a no-op.
Acked-by: Ian Rogers <[email protected]>
Signed-off-by: James Clark <[email protected]>
---
arch/x86/events/core.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 9d248703cbdd..2353aaf0b248 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2168,7 +2168,6 @@ static int __init init_hw_perf_events(void)
hybrid_pmu->pmu = pmu;
hybrid_pmu->pmu.type = -1;
hybrid_pmu->pmu.attr_update = x86_pmu.attr_update;
- hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;
hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_EXTENDED_HW_TYPE;
err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name,
--
2.34.1
This capability gives us the ability to open PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE events on a specific PMU for free. All the
implementation is contained in the Perf core and tool code so no change
to the Arm PMU driver is needed.
The following basic use case now results in Perf opening the event on
all PMUs rather than picking only one in an unpredictable way:
$ perf stat -e cycles -- taskset --cpu-list 0,1 stress -c 2
Performance counter stats for 'taskset --cpu-list 0,1 stress -c 2':
963279620 armv8_cortex_a57/cycles/ (99.19%)
752745657 armv8_cortex_a53/cycles/ (94.80%)
Fixes: 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
Suggested-by: Ian Rogers <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Signed-off-by: James Clark <[email protected]>
---
drivers/perf/arm_pmu.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index f6ccb2cd4dfc..2e79201daa4a 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -880,8 +880,13 @@ struct arm_pmu *armpmu_alloc(void)
* configuration (e.g. big.LITTLE). This is not an uncore PMU,
* and we have taken ctx sharing into account (e.g. with our
* pmu::filter callback and pmu::event_init group validation).
+ *
+ * PERF_PMU_CAP_EXTENDED_HW_TYPE is required to open
+ * PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events on a
+ * specific PMU.
*/
- .capabilities = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS,
+ .capabilities = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS |
+ PERF_PMU_CAP_EXTENDED_HW_TYPE,
};
pmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =
--
2.34.1
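For readers unfamiliar with the extended hardware type, the sketch below
shows how a tool might encode "this generic hardware event, on this PMU"
into a single config value. It assumes the layout from the commit named in
the Fixes tag: the PMU type goes in the upper 32 bits of attr.config and the
generic event ID in the lower 32 bits. The DEMO_* names and the helper
functions are hypothetical. The constants are local stand-ins for
PERF_PMU_TYPE_SHIFT, PERF_HW_EVENT_MASK and PERF_COUNT_HW_CPU_CYCLES, defined
here so the sketch builds without kernel headers:

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the uapi constants (assumed values). */
#define DEMO_PMU_TYPE_SHIFT       32
#define DEMO_HW_EVENT_MASK        0xffffffffULL
#define DEMO_COUNT_HW_CPU_CYCLES  0ULL

/*
 * Build a config value that asks for generic event 'hw_event' on the
 * PMU whose dynamic type number is 'pmu_type'.
 */
static uint64_t demo_extended_hw_config(uint32_t pmu_type, uint64_t hw_event)
{
	/* The generic event ID must fit in the lower 32 bits. */
	assert((hw_event & ~DEMO_HW_EVENT_MASK) == 0);
	return ((uint64_t)pmu_type << DEMO_PMU_TYPE_SHIFT) | hw_event;
}

/* Extract the PMU type from the upper 32 bits of a config value. */
static uint32_t demo_config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> DEMO_PMU_TYPE_SHIFT);
}

/* Extract the generic event ID from the lower 32 bits. */
static uint64_t demo_config_hw_event(uint64_t config)
{
	return config & DEMO_HW_EVENT_MASK;
}
```

In a real tool the PMU type would typically be read from
/sys/bus/event_source/devices/<pmu>/type, and the resulting value placed in
attr.config with attr.type set to PERF_TYPE_HARDWARE before calling
perf_event_open().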
On 7/24/23 19:14, James Clark wrote:
> This capability gives us the ability to open PERF_TYPE_HARDWARE and
> PERF_TYPE_HW_CACHE events on a specific PMU for free. All the
> implementation is contained in the Perf core and tool code so no change
> to the Arm PMU driver is needed.
>
> The following basic use case now results in Perf opening the event on
> all PMUs rather than picking only one in an unpredictable way:
>
> $ perf stat -e cycles -- taskset --cpu-list 0,1 stress -c 2
>
> Performance counter stats for 'taskset --cpu-list 0,1 stress -c 2':
>
> 963279620 armv8_cortex_a57/cycles/ (99.19%)
> 752745657 armv8_cortex_a53/cycles/ (94.80%)
>
> Fixes: 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
> Suggested-by: Ian Rogers <[email protected]>
> Acked-by: Ian Rogers <[email protected]>
> Signed-off-by: James Clark <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
> ---
> drivers/perf/arm_pmu.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index f6ccb2cd4dfc..2e79201daa4a 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -880,8 +880,13 @@ struct arm_pmu *armpmu_alloc(void)
> * configuration (e.g. big.LITTLE). This is not an uncore PMU,
> * and we have taken ctx sharing into account (e.g. with our
> * pmu::filter callback and pmu::event_init group validation).
> + *
> + * PERF_PMU_CAP_EXTENDED_HW_TYPE is required to open
> + * PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events on a
> + * specific PMU.
> */
> - .capabilities = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS,
> + .capabilities = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS |
> + PERF_PMU_CAP_EXTENDED_HW_TYPE,
> };
>
> pmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =
On Mon, Jul 24, 2023 at 02:44:55PM +0100, James Clark wrote:
> James Clark (4):
> arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability
> perf/x86: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
> arm_pmu: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
> perf: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
>
> arch/x86/events/core.c | 1 -
> drivers/perf/arm_pmu.c | 10 ++++++----
> include/linux/perf_event.h | 7 +++----
> 3 files changed, 9 insertions(+), 9 deletions(-)
Thanks!
On Tue, Jul 25, 2023 at 02:10:23PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 24, 2023 at 02:44:55PM +0100, James Clark wrote:
>
> > James Clark (4):
> > arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability
> > perf/x86: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
> > arm_pmu: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
> > perf: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
> >
> > arch/x86/events/core.c | 1 -
> > drivers/perf/arm_pmu.c | 10 ++++++----
> > include/linux/perf_event.h | 7 +++----
> > 3 files changed, 9 insertions(+), 9 deletions(-)
>
> Thanks!
Ah, I see that you've queued that in the perf/core branch of your queue tree.
Given that, I assume you don't need anything from me or from Will, but just in
case, for the series:
Acked-by: Mark Rutland <[email protected]>
Mark.
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 4b36873b4a3455590f686903c354c4716e149c74
Gitweb: https://git.kernel.org/tip/4b36873b4a3455590f686903c354c4716e149c74
Author: James Clark <[email protected]>
AuthorDate: Mon, 24 Jul 2023 14:44:57 +01:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Wed, 26 Jul 2023 12:28:46 +02:00
perf/x86: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability
Since commit bd2756811766 ("perf: Rewrite core context handling") the
relationship between perf_event_context and PMUs has changed so that
the error scenario that PERF_PMU_CAP_HETEROGENEOUS_CPUS originally
silenced no longer exists.
Remove the capability to avoid confusion that it actually influences
any perf core behavior. This change should be a no-op.
Signed-off-by: James Clark <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
arch/x86/events/core.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 23c9642..185f902 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2166,7 +2166,6 @@ static int __init init_hw_perf_events(void)
hybrid_pmu->pmu = pmu;
hybrid_pmu->pmu.type = -1;
hybrid_pmu->pmu.attr_update = x86_pmu.attr_update;
- hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_HETEROGENEOUS_CPUS;
hybrid_pmu->pmu.capabilities |= PERF_PMU_CAP_EXTENDED_HW_TYPE;
err = perf_pmu_register(&hybrid_pmu->pmu, hybrid_pmu->name,
The following commit has been merged into the perf/core branch of tip:
Commit-ID: 5c816728651ae425954542fed64d21d40cb75a9f
Gitweb: https://git.kernel.org/tip/5c816728651ae425954542fed64d21d40cb75a9f
Author: James Clark <[email protected]>
AuthorDate: Mon, 24 Jul 2023 14:44:56 +01:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Wed, 26 Jul 2023 12:28:46 +02:00
arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability
This capability gives us the ability to open PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE events on a specific PMU for free. All the
implementation is contained in the Perf core and tool code so no change
to the Arm PMU driver is needed.
The following basic use case now results in Perf opening the event on
all PMUs rather than picking only one in an unpredictable way:
$ perf stat -e cycles -- taskset --cpu-list 0,1 stress -c 2
Performance counter stats for 'taskset --cpu-list 0,1 stress -c 2':
963279620 armv8_cortex_a57/cycles/ (99.19%)
752745657 armv8_cortex_a53/cycles/ (94.80%)
Fixes: 55bcf6ef314a ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")
Suggested-by: Ian Rogers <[email protected]>
Signed-off-by: James Clark <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/perf/arm_pmu.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index f6ccb2c..2e79201 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -880,8 +880,13 @@ struct arm_pmu *armpmu_alloc(void)
* configuration (e.g. big.LITTLE). This is not an uncore PMU,
* and we have taken ctx sharing into account (e.g. with our
* pmu::filter callback and pmu::event_init group validation).
+ *
+ * PERF_PMU_CAP_EXTENDED_HW_TYPE is required to open
+ * PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE events on a
+ * specific PMU.
*/
- .capabilities = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS,
+ .capabilities = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS |
+ PERF_PMU_CAP_EXTENDED_HW_TYPE,
};
pmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =