From: Wen Yang
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Thomas Gleixner
Cc: Wen Yang, Stephane Eranian, Mark Rutland, Jiri Olsa, Namhyung Kim, Borislav Petkov, x86@kernel.org, Wen Yang, "H. Peter Anvin", linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RESEND PATCH v2 1/3] perf/x86: extract code to assign perf events for both core and uncore
Date: Mon, 14 Mar 2022 01:21:42 +0800
Message-Id: <20220313172144.78141-1-simon.wy@alibaba-inc.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The following two patterns are duplicated in multiple places in the
x86 perf code:

- fast path: try to reuse the previous register
- slow path: assign a counter for each event

To improve code quality, and to prepare for the following patches in
this series, which also use these patterns, extract the code into
perf_assign_events().

This commit does not change functionality.

Signed-off-by: Wen Yang
Cc: Peter Zijlstra (Intel)
Cc: Stephane Eranian
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Gleixner
Cc: Borislav Petkov
Cc: x86@kernel.org
Cc: Wen Yang
Cc: "H. Peter Anvin"
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/events/core.c         | 141 ++++++++++++++++++++++-------------------
 arch/x86/events/intel/uncore.c |  31 +--------
 arch/x86/events/perf_event.h   |   6 +-
 3 files changed, 82 insertions(+), 96 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index e686c5e..b14fb1b 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -950,10 +950,7 @@ static bool perf_sched_next_event(struct perf_sched *sched)
 	return true;
 }
 
-/*
- * Assign a counter for each event.
- */
-int perf_assign_events(struct event_constraint **constraints, int n,
+static int __perf_assign_events(struct event_constraint **constraints, int n,
 			int wmin, int wmax, int gpmax, int *assign)
 {
 	struct perf_sched sched;
@@ -969,16 +966,66 @@ int perf_assign_events(struct event_constraint **constraints, int n,
 
 	return sched.state.unassigned;
 }
+
+/*
+ * Assign a counter for each event.
+ */
+int perf_assign_events(struct perf_event **event_list,
+		struct event_constraint **constraints, int n,
+		int wmin, int wmax, int gpmax, int *assign)
+{
+	struct event_constraint *c;
+	struct hw_perf_event *hwc;
+	u64 used_mask = 0;
+	int unsched = 0;
+	int i;
+
+	/*
+	 * fastpath, try to reuse previous register
+	 */
+	for (i = 0; i < n; i++) {
+		u64 mask;
+
+		hwc = &event_list[i]->hw;
+		c = constraints[i];
+
+		/* never assigned */
+		if (hwc->idx == -1)
+			break;
+
+		/* constraint still honored */
+		if (!test_bit(hwc->idx, c->idxmsk))
+			break;
+
+		mask = BIT_ULL(hwc->idx);
+		if (is_counter_pair(hwc))
+			mask |= mask << 1;
+
+		/* not already used */
+		if (used_mask & mask)
+			break;
+
+		used_mask |= mask;
+
+		if (assign)
+			assign[i] = hwc->idx;
+	}
+
+	/* slow path */
+	if (i != n)
+		unsched = __perf_assign_events(constraints, n,
+				wmin, wmax, gpmax, assign);
+
+	return unsched;
+}
 EXPORT_SYMBOL_GPL(perf_assign_events);
 
 int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 {
 	int num_counters = hybrid(cpuc->pmu, num_counters);
-	struct event_constraint *c;
-	struct perf_event *e;
 	int n0, i, wmin, wmax, unsched = 0;
-	struct hw_perf_event *hwc;
-	u64 used_mask = 0;
+	struct event_constraint *c;
+	int gpmax = num_counters;
 
 	/*
 	 * Compute the number of events already present; see x86_pmu_add(),
@@ -1017,66 +1064,30 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 	}
 
 	/*
-	 * fastpath, try to reuse previous register
+	 * Do not allow scheduling of more than half the available
+	 * generic counters.
+	 *
+	 * This helps avoid counter starvation of sibling thread by
+	 * ensuring at most half the counters cannot be in exclusive
+	 * mode. There is no designated counters for the limits. Any
+	 * N/2 counters can be used. This helps with events with
+	 * specific counter constraints.
 	 */
-	for (i = 0; i < n; i++) {
-		u64 mask;
-
-		hwc = &cpuc->event_list[i]->hw;
-		c = cpuc->event_constraint[i];
-
-		/* never assigned */
-		if (hwc->idx == -1)
-			break;
-
-		/* constraint still honored */
-		if (!test_bit(hwc->idx, c->idxmsk))
-			break;
-
-		mask = BIT_ULL(hwc->idx);
-		if (is_counter_pair(hwc))
-			mask |= mask << 1;
-
-		/* not already used */
-		if (used_mask & mask)
-			break;
+	if (is_ht_workaround_enabled() && !cpuc->is_fake &&
+	    READ_ONCE(cpuc->excl_cntrs->exclusive_present))
+		gpmax /= 2;
 
-		used_mask |= mask;
-
-		if (assign)
-			assign[i] = hwc->idx;
+	/*
+	 * Reduce the amount of available counters to allow fitting
+	 * the extra Merge events needed by large increment events.
+	 */
+	if (x86_pmu.flags & PMU_FL_PAIR) {
+		gpmax = num_counters - cpuc->n_pair;
+		WARN_ON(gpmax <= 0);
 	}
 
-	/* slow path */
-	if (i != n) {
-		int gpmax = num_counters;
-
-		/*
-		 * Do not allow scheduling of more than half the available
-		 * generic counters.
-		 *
-		 * This helps avoid counter starvation of sibling thread by
-		 * ensuring at most half the counters cannot be in exclusive
-		 * mode. There is no designated counters for the limits. Any
-		 * N/2 counters can be used. This helps with events with
-		 * specific counter constraints.
-		 */
-		if (is_ht_workaround_enabled() && !cpuc->is_fake &&
-		    READ_ONCE(cpuc->excl_cntrs->exclusive_present))
-			gpmax /= 2;
-
-		/*
-		 * Reduce the amount of available counters to allow fitting
-		 * the extra Merge events needed by large increment events.
-		 */
-		if (x86_pmu.flags & PMU_FL_PAIR) {
-			gpmax = num_counters - cpuc->n_pair;
-			WARN_ON(gpmax <= 0);
-		}
-
-		unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
-					     wmax, gpmax, assign);
-	}
+	unsched = perf_assign_events(cpuc->event_list, cpuc->event_constraint,
+				     n, wmin, wmax, gpmax, assign);
 
 	/*
 	 * In case of success (unsched = 0), mark events as committed,
@@ -1093,7 +1104,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 			static_call_cond(x86_pmu_commit_scheduling)(cpuc, i, assign[i]);
 	} else {
 		for (i = n0; i < n; i++) {
-			e = cpuc->event_list[i];
+			struct perf_event *e = cpuc->event_list[i];
 
 			/*
 			 * release events that failed scheduling
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index e497da9..101358a 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -442,12 +442,8 @@ static void uncore_put_event_constraint(struct intel_uncore_box *box,
 
 static int uncore_assign_events(struct intel_uncore_box *box, int assign[], int n)
 {
-	unsigned long used_mask[BITS_TO_LONGS(UNCORE_PMC_IDX_MAX)];
 	struct event_constraint *c;
 	int i, wmin, wmax, ret = 0;
-	struct hw_perf_event *hwc;
-
-	bitmap_zero(used_mask, UNCORE_PMC_IDX_MAX);
 
 	for (i = 0, wmin = UNCORE_PMC_IDX_MAX, wmax = 0; i < n; i++) {
 		c = uncore_get_event_constraint(box, box->event_list[i]);
@@ -456,31 +452,8 @@ static int uncore_assign_events(struct intel_uncore_box *box, int assign[], int
 		wmax = max(wmax, c->weight);
 	}
 
-	/* fastpath, try to reuse previous register */
-	for (i = 0; i < n; i++) {
-		hwc = &box->event_list[i]->hw;
-		c = box->event_constraint[i];
-
-		/* never assigned */
-		if (hwc->idx == -1)
-			break;
-
-		/* constraint still honored */
-		if (!test_bit(hwc->idx, c->idxmsk))
-			break;
-
-		/* not already used */
-		if (test_bit(hwc->idx, used_mask))
-			break;
-
-		__set_bit(hwc->idx, used_mask);
-		if (assign)
-			assign[i] = hwc->idx;
-	}
-	/* slow path */
-	if (i != n)
-		ret = perf_assign_events(box->event_constraint, n,
-					 wmin, wmax, n, assign);
+	ret = perf_assign_events(box->event_list,
+			box->event_constraint, n, wmin, wmax, n, assign);
 
 	if (!assign || ret) {
 		for (i = 0; i < n; i++)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 150261d..f1acd1d 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1130,8 +1130,10 @@ static inline void __x86_pmu_enable_event(struct hw_perf_event *hwc,
 
 void x86_pmu_enable_all(int added);
 
-int perf_assign_events(struct event_constraint **constraints, int n,
-			int wmin, int wmax, int gpmax, int *assign);
+int perf_assign_events(struct perf_event **event_list,
+		struct event_constraint **constraints, int n,
+		int wmin, int wmax, int gpmax, int *assign);
+
 int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign);
 
 void x86_pmu_stop(struct perf_event *event, int flags);
-- 
1.8.3.1
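
For readers who want to try the fast path / slow path split outside the kernel, here is a minimal userspace sketch. It is not part of the patch. Every name in it (struct sketch_event, sketch_fast_path(), sketch_slow_path(), sketch_assign_events()) is hypothetical, constraints are modelled as a plain 64-bit mask, counter pairs are ignored, and the slow path is a naive first-fit stand-in rather than the scheduler behind __perf_assign_events().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel structures; not the real layout. */
struct sketch_event {
	int idx;		/* previously assigned counter, -1 if none */
	uint64_t idxmsk;	/* bit k set: counter k satisfies the constraint */
};

/*
 * Fast path: try to keep every event on its previous counter.
 * Returns the number of events handled before the first failure;
 * a return value of n means all events kept their counters.
 */
static int sketch_fast_path(const struct sketch_event *ev, int n, int *assign)
{
	uint64_t used_mask = 0;
	int i;

	for (i = 0; i < n; i++) {
		uint64_t mask;

		if (ev[i].idx == -1)		/* never assigned */
			break;
		assert(ev[i].idx >= 0 && ev[i].idx < 64);
		mask = UINT64_C(1) << ev[i].idx;
		if (!(ev[i].idxmsk & mask))	/* constraint no longer honored */
			break;
		if (used_mask & mask)		/* counter already used */
			break;
		used_mask |= mask;
		if (assign)
			assign[i] = ev[i].idx;
	}
	return i;
}

/*
 * Slow path placeholder: naive first-fit over ncounters counters.
 * ASSUMPTION: this only illustrates the fallback; the kernel scheduler
 * is more elaborate. Returns the number of events left unassigned.
 */
static int sketch_slow_path(const struct sketch_event *ev, int n,
			    int ncounters, int *assign)
{
	uint64_t used_mask = 0;
	int unassigned = 0;
	int i, c;

	assert(ncounters >= 0 && ncounters <= 64);
	for (i = 0; i < n; i++) {
		for (c = 0; c < ncounters; c++) {
			uint64_t mask = UINT64_C(1) << c;

			if ((ev[i].idxmsk & mask) && !(used_mask & mask))
				break;
		}
		if (c == ncounters) {		/* no free counter fits */
			unassigned++;
			continue;
		}
		used_mask |= UINT64_C(1) << c;
		if (assign)
			assign[i] = c;
	}
	return unassigned;
}

/* Combined helper: fast path first, slow path only if it bailed out. */
static int sketch_assign_events(const struct sketch_event *ev, int n,
				int ncounters, int *assign)
{
	if (sketch_fast_path(ev, n, assign) == n)
		return 0;
	return sketch_slow_path(ev, n, ncounters, assign);
}
```

As in the patch, the combined helper returns 0 on success and the number of unscheduled events otherwise, so a caller only needs one call where it previously open-coded both paths.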