From: Wen Yang
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Thomas Gleixner
Cc: Wen Yang, Mark Rutland, Jiri Olsa, Namhyung Kim, Borislav Petkov, x86@kernel.org, "H. Peter Anvin", linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] perf/x86: extract code to assign perf events for both core and uncore
Date: Thu, 3 Mar 2022 16:26:21 +0800
Message-Id: <20220303082622.32847-1-simon.wy@alibaba-inc.com>
X-Mailer: git-send-email 2.23.0
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

The following two patterns appear in multiple places in the x86 perf code, with similar logic duplicated at each site:

- fast path: try to reuse the previously assigned register
- slow path: assign a counter for each event

To reduce this duplication, and to prepare for a following patch series that uses the same patterns, extract the code into perf_assign_events.

This commit doesn't change functionality.

Signed-off-by: Wen Yang
Cc: Peter Zijlstra (Intel)
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Alexander Shishkin
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Thomas Gleixner
Cc: Borislav Petkov
Cc: x86@kernel.org
Cc: Wen Yang
Cc: "H. Peter Anvin"
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/events/core.c         | 141 ++++++++++++++++++---------------
 arch/x86/events/intel/uncore.c |  31 +-------
 arch/x86/events/perf_event.h   |   6 +-
 3 files changed, 82 insertions(+), 96 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index eef816fc216d..9846d422f06d 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -950,10 +950,7 @@ static bool perf_sched_next_event(struct perf_sched *sched)
 	return true;
 }
 
-/*
- * Assign a counter for each event.
- */
-int perf_assign_events(struct event_constraint **constraints, int n,
+int _perf_assign_events(struct event_constraint **constraints, int n,
 			int wmin, int wmax, int gpmax, int *assign)
 {
 	struct perf_sched sched;
@@ -969,16 +966,66 @@ int perf_assign_events(struct event_constraint **constraints, int n,
 	return sched.state.unassigned;
 }
+
+/*
+ * Assign a counter for each event.
+ */
+int perf_assign_events(struct perf_event **event_list,
+		struct event_constraint **constraints, int n,
+		int wmin, int wmax, int gpmax, int *assign)
+{
+	struct event_constraint *c;
+	struct hw_perf_event *hwc;
+	u64 used_mask = 0;
+	int unsched = 0;
+	int i;
+
+	/*
+	 * fastpath, try to reuse previous register
+	 */
+	for (i = 0; i < n; i++) {
+		u64 mask;
+
+		hwc = &event_list[i]->hw;
+		c = constraints[i];
+
+		/* never assigned */
+		if (hwc->idx == -1)
+			break;
+
+		/* constraint still honored */
+		if (!test_bit(hwc->idx, c->idxmsk))
+			break;
+
+		mask = BIT_ULL(hwc->idx);
+		if (is_counter_pair(hwc))
+			mask |= mask << 1;
+
+		/* not already used */
+		if (used_mask & mask)
+			break;
+
+		used_mask |= mask;
+
+		if (assign)
+			assign[i] = hwc->idx;
+	}
+
+	/* slow path */
+	if (i != n)
+		unsched = _perf_assign_events(constraints, n,
+				wmin, wmax, gpmax, assign);
+
+	return unsched;
+}
 EXPORT_SYMBOL_GPL(perf_assign_events);
 
 int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 {
 	int num_counters = hybrid(cpuc->pmu, num_counters);
-	struct event_constraint *c;
-	struct perf_event *e;
 	int n0, i, wmin, wmax, unsched = 0;
-	struct hw_perf_event *hwc;
-	u64 used_mask = 0;
+	struct event_constraint *c;
+	int gpmax = num_counters;
 
 	/*
 	 * Compute the number of events already present; see x86_pmu_add(),
@@ -1017,66 +1064,30 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 	}
 
 	/*
-	 * fastpath, try to reuse previous register
+	 * Do not allow scheduling of more than half the available
+	 * generic counters.
+	 *
+	 * This helps avoid counter starvation of sibling thread by
+	 * ensuring at most half the counters cannot be in exclusive
+	 * mode. There is no designated counters for the limits. Any
+	 * N/2 counters can be used. This helps with events with
+	 * specific counter constraints.
 	 */
-	for (i = 0; i < n; i++) {
-		u64 mask;
-
-		hwc = &cpuc->event_list[i]->hw;
-		c = cpuc->event_constraint[i];
-
-		/* never assigned */
-		if (hwc->idx == -1)
-			break;
-
-		/* constraint still honored */
-		if (!test_bit(hwc->idx, c->idxmsk))
-			break;
-
-		mask = BIT_ULL(hwc->idx);
-		if (is_counter_pair(hwc))
-			mask |= mask << 1;
-
-		/* not already used */
-		if (used_mask & mask)
-			break;
+	if (is_ht_workaround_enabled() && !cpuc->is_fake &&
+	    READ_ONCE(cpuc->excl_cntrs->exclusive_present))
+		gpmax /= 2;
 
-		used_mask |= mask;
-
-		if (assign)
-			assign[i] = hwc->idx;
+	/*
+	 * Reduce the amount of available counters to allow fitting
+	 * the extra Merge events needed by large increment events.
+	 */
+	if (x86_pmu.flags & PMU_FL_PAIR) {
+		gpmax = num_counters - cpuc->n_pair;
+		WARN_ON(gpmax <= 0);
 	}
 
-	/* slow path */
-	if (i != n) {
-		int gpmax = num_counters;
-
-		/*
-		 * Do not allow scheduling of more than half the available
-		 * generic counters.
-		 *
-		 * This helps avoid counter starvation of sibling thread by
-		 * ensuring at most half the counters cannot be in exclusive
-		 * mode. There is no designated counters for the limits. Any
-		 * N/2 counters can be used. This helps with events with
-		 * specific counter constraints.
-		 */
-		if (is_ht_workaround_enabled() && !cpuc->is_fake &&
-		    READ_ONCE(cpuc->excl_cntrs->exclusive_present))
-			gpmax /= 2;
-
-		/*
-		 * Reduce the amount of available counters to allow fitting
-		 * the extra Merge events needed by large increment events.
-		 */
-		if (x86_pmu.flags & PMU_FL_PAIR) {
-			gpmax = num_counters - cpuc->n_pair;
-			WARN_ON(gpmax <= 0);
-		}
-
-		unsched = perf_assign_events(cpuc->event_constraint, n, wmin,
-					     wmax, gpmax, assign);
-	}
+	unsched = perf_assign_events(cpuc->event_list, cpuc->event_constraint,
+			n, wmin, wmax, gpmax, assign);
 
 	/*
 	 * In case of success (unsched = 0), mark events as committed,
@@ -1093,7 +1104,7 @@ int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
 			static_call_cond(x86_pmu_commit_scheduling)(cpuc, i, assign[i]);
 	} else {
 		for (i = n0; i < n; i++) {
-			e = cpuc->event_list[i];
+			struct perf_event *e = cpuc->event_list[i];
 
 			/*
 			 * release events that failed scheduling
diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index e497da9bf427..101358ae2814 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -442,12 +442,8 @@ static void uncore_put_event_constraint(struct intel_uncore_box *box,
 static int uncore_assign_events(struct intel_uncore_box *box, int assign[], int n)
 {
-	unsigned long used_mask[BITS_TO_LONGS(UNCORE_PMC_IDX_MAX)];
 	struct event_constraint *c;
 	int i, wmin, wmax, ret = 0;
-	struct hw_perf_event *hwc;
-
-	bitmap_zero(used_mask, UNCORE_PMC_IDX_MAX);
 
 	for (i = 0, wmin = UNCORE_PMC_IDX_MAX, wmax = 0; i < n; i++) {
 		c = uncore_get_event_constraint(box, box->event_list[i]);
@@ -456,31 +452,8 @@ static int uncore_assign_events(struct intel_uncore_box *box, int assign[], int n)
 		wmax = max(wmax, c->weight);
 	}
 
-	/* fastpath, try to reuse previous register */
-	for (i = 0; i < n; i++) {
-		hwc = &box->event_list[i]->hw;
-		c = box->event_constraint[i];
-
-		/* never assigned */
-		if (hwc->idx == -1)
-			break;
-
-		/* constraint still honored */
-		if (!test_bit(hwc->idx, c->idxmsk))
-			break;
-
-		/* not already used */
-		if (test_bit(hwc->idx, used_mask))
-			break;
-
-		__set_bit(hwc->idx, used_mask);
-		if (assign)
-			assign[i] = hwc->idx;
-	}
-
-	/* slow path */
-	if (i != n)
-		ret = perf_assign_events(box->event_constraint, n,
-					 wmin, wmax, n, assign);
+	ret = perf_assign_events(box->event_list,
+			box->event_constraint, n, wmin, wmax, n, assign);
 
 	if (!assign || ret) {
 		for (i = 0; i < n; i++)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 150261d929b9..f1acd1ded001 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1130,8 +1130,10 @@ static inline void __x86_pmu_enable_event(struct hw_perf_event *hwc,
 
 void x86_pmu_enable_all(int added);
 
-int perf_assign_events(struct event_constraint **constraints, int n,
-			int wmin, int wmax, int gpmax, int *assign);
+int perf_assign_events(struct perf_event **event_list,
+		struct event_constraint **constraints, int n,
+		int wmin, int wmax, int gpmax, int *assign);
+
 int x86_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign);
 
 void x86_pmu_stop(struct perf_event *event, int flags);
-- 
2.19.1.6.gb485710b