The current code unconditionally clears cpuc->event_constraint[i]
before calling get_event_constraints(.idx=i). The only site that cares
is intel_get_event_constraints() where the c1 load will always be
NULL.
However, always calling get_event_constraints() on all events is
wasteful; most of the time it will return the exact same result. Therefore
retain the logic in intel_get_event_constraints() and change the
generic code to only clear the constraint on put.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
arch/x86/events/core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
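
For context, the c1 load mentioned above sits at the top of
intel_get_event_constraints(). A minimal standalone model of what that
load can now observe (hypothetical names and values, not the kernel
code):

	/*
	 * Before the patch, the caller zeroed the slot ahead of every
	 * get, so the c1 load below could only ever see NULL; after it,
	 * the slot keeps the previous pass's result until put clears it.
	 */
	#include <stdio.h>

	struct event_constraint { int weight; };

	#define N_SLOTS 8
	static struct event_constraint *event_constraint[N_SLOTS]; /* cpuc->event_constraint[] */
	static struct event_constraint table_result = { .weight = 1 };

	static struct event_constraint *get_event_constraints(int idx)
	{
		struct event_constraint *c1 = event_constraint[idx];	/* the "c1 load" */

		if (c1)	/* only possible now that the pre-clear is gone */
			printf("slot %d: previous constraint still visible\n", idx);

		return &table_result;	/* walk the constraint tables as before */
	}

	static void put_event_constraints(int idx)
	{
		event_constraint[idx] = NULL;	/* the new clear-on-put site */
	}

	int main(void)
	{
		event_constraint[0] = get_event_constraints(0);	/* first pass: c1 NULL */
		get_event_constraints(0);			/* second pass: c1 set */
		put_event_constraints(0);			/* release: slot cleared */
		get_event_constraints(0);			/* fresh again: c1 NULL */
		return 0;
	}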
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -858,7 +858,6 @@ int x86_schedule_events(struct cpu_hw_ev
 		x86_pmu.start_scheduling(cpuc);

 	for (i = 0, wmin = X86_PMC_IDX_MAX, wmax = 0; i < n; i++) {
-		cpuc->event_constraint[i] = NULL;
 		c = x86_pmu.get_event_constraints(cpuc, i, cpuc->event_list[i]);
 		cpuc->event_constraint[i] = c;

@@ -941,6 +940,8 @@ int x86_schedule_events(struct cpu_hw_ev
 			 */
 			if (x86_pmu.put_event_constraints)
 				x86_pmu.put_event_constraints(cpuc, e);
+
+			cpuc->event_constraint[i] = NULL;
 		}
 	}

@@ -1404,6 +1405,7 @@ static void x86_pmu_del(struct perf_even
 			cpuc->event_list[i-1] = cpuc->event_list[i];
 			cpuc->event_constraint[i-1] = cpuc->event_constraint[i];
 		}
+		cpuc->event_constraint[i-1] = NULL;
 		--cpuc->n_events;

 	perf_event_update_userpage(event);
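
The x86_pmu_del() hunk is needed because the compaction loop shifts the
remaining entries down but used to leave the last slot holding a stale
pointer. With three scheduled events and the middle one deleted
(illustrative values only):

	before del:      event_constraint[] = { cA, cB, cC }
	after the shift: event_constraint[] = { cA, cC, cC }	/* stale cC copy */
	with the clear:  event_constraint[] = { cA, cC, NULL }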
On Thu, Mar 14, 2019 at 6:11 AM Peter Zijlstra <[email protected]> wrote:
>
> The current code unconditionally clears cpuc->event_constraint[i]
> before calling get_event_constraints(.idx=i). The only site that cares
> is intel_get_event_constraints() where the c1 load will always be
> NULL.
>
> However, always calling get_event_constraints() on all events is
> wasteful; most of the time it will return the exact same result. Therefore
> retain the logic in intel_get_event_constraints() and change the
> generic code to only clear the constraint on put.
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
I thought there was caching of the constraint in case it was static
(or unconstrained), to avoid walking the constraint table on each
invocation within the same group_sched_in() call. But the way the
c1 vs. c2 logic is written, I don't see it; in which case, this could
be another opportunity.
Looks good to me.
Reviewed-by: Stephane Eranian <[email protected]>
> ---
> arch/x86/events/core.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -858,7 +858,6 @@ int x86_schedule_events(struct cpu_hw_ev
>  		x86_pmu.start_scheduling(cpuc);
>
>  	for (i = 0, wmin = X86_PMC_IDX_MAX, wmax = 0; i < n; i++) {
> -		cpuc->event_constraint[i] = NULL;
>  		c = x86_pmu.get_event_constraints(cpuc, i, cpuc->event_list[i]);
>  		cpuc->event_constraint[i] = c;
>
> @@ -941,6 +940,8 @@ int x86_schedule_events(struct cpu_hw_ev
>  			 */
>  			if (x86_pmu.put_event_constraints)
>  				x86_pmu.put_event_constraints(cpuc, e);
> +
> +			cpuc->event_constraint[i] = NULL;
>  		}
>  	}
>
> @@ -1404,6 +1405,7 @@ static void x86_pmu_del(struct perf_even
>  			cpuc->event_list[i-1] = cpuc->event_list[i];
>  			cpuc->event_constraint[i-1] = cpuc->event_constraint[i];
>  		}
> +		cpuc->event_constraint[i-1] = NULL;
>  		--cpuc->n_events;
>
>  	perf_event_update_userpage(event);
>
>
On Tue, Mar 19, 2019 at 02:50:21PM -0700, Stephane Eranian wrote:
> On Thu, Mar 14, 2019 at 6:11 AM Peter Zijlstra <[email protected]> wrote:
> >
> > The current code unconditionally clears cpuc->event_constraint[i]
> > before calling get_event_constraints(.idx=i). The only site that cares
> > is intel_get_event_constraints() where the c1 load will always be
> > NULL.
> >
> > However, always calling get_event_constraints() on all events is
> > wasteful; most of the time it will return the exact same result. Therefore
> > retain the logic in intel_get_event_constraints() and change the
> > generic code to only clear the constraint on put.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
>
> I thought there was caching of the constraint in case it was static
> (or unconstrained),
Yeah, I was under the same impression :-/
> to avoid walking the constraint table on each invocation within the
> same group_sched_in() call. But the way the c1 vs. c2 logic is
> written, I don't see it; in which case, this could be another
> opportunity.
Right, the next patch attempts this.
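
Something like the following could do it (a hypothetical sketch only,
not the actual follow-up patch; PERF_X86_EVENT_DYNAMIC marks the
constraints that can change between scheduling passes):

	c1 = cpuc->event_constraint[idx];
	if (c1 && !(c1->flags & PERF_X86_EVENT_DYNAMIC))
		return c1;	/* static: same answer as last pass */

	c2 = __intel_get_event_constraints(cpuc, idx, event);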