From: Wen Yang
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Thomas Gleixner
Cc: Wen Yang, Stephane Eranian, Mark Rutland, Jiri Olsa, Namhyung Kim,
	Borislav Petkov, x86@kernel.org, Wen Yang, "H. Peter Anvin",
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Peter Anvin" , linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 3/3] perf/x86: reuse scarce pmu counters Date: Mon, 14 Mar 2022 00:50:47 +0800 Message-Id: <20220313165047.77391-3-simon.wy@alibaba-inc.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20220313165047.77391-1-simon.wy@alibaba-inc.com> References: <20220313165047.77391-1-simon.wy@alibaba-inc.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE, UNPARSEABLE_RELAY,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The nmi watchdog may permanently consume a fixed counter (*cycles*), so when other programs collect *cycles* again, they will occupy a GP. Here is a slight optimization: save a generic counter for events that are non-sampling type and using a fixed counter. Signed-off-by: Wen Yang Cc: Peter Zijlstra (Intel) Cc: Stephane Eranian Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Mark Rutland Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Thomas Gleixner Cc: Borislav Petkov Cc: x86@kernel.org Cc: Wen Yang Cc: "H. Peter Anvin" Cc: linux-perf-users@vger.kernel.org Cc: linux-kernel@vger.kernel.org --- arch/x86/events/core.c | 45 +++++++++++++++++++++++++++++++-------------- 1 file changed, 31 insertions(+), 14 deletions(-) diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index b7f5925..6ddddf1 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -799,6 +799,7 @@ struct perf_sched { u64 msk_counters; u64 msk_events; struct event_constraint **constraints; + struct perf_event **events; struct sched_state state; struct sched_state saved[SCHED_STATES_MAX]; }; @@ -846,7 +847,8 @@ static int perf_sched_calc_event(struct event_constraint **constraints, /* * Initialize iterator that runs through all events and counters. 
  */
-static void perf_sched_init(struct perf_sched *sched, struct event_constraint **constraints,
+static void perf_sched_init(struct perf_sched *sched,
+		struct perf_event **events, struct event_constraint **constraints,
 		int num, int wmin, int wmax, int gpmax, u64 mevt, u64 mcnt)
 {
 	memset(sched, 0, sizeof(*sched));
@@ -854,12 +856,13 @@ static void perf_sched_init(struct perf_sched *sched, struct event_constraint **
 	sched->max_weight	= wmax;
 	sched->max_gp		= gpmax;
 	sched->constraints	= constraints;
+	sched->events		= events;
 	sched->msk_events	= mevt;
 	sched->msk_counters	= mcnt;
 
 	sched->state.weight = perf_sched_calc_weight(constraints, num, wmin, wmax, mcnt);
 	sched->state.event = perf_sched_calc_event(constraints, num, sched->state.weight, mevt);
-	sched->state.unassigned	= num - hweight_long(sched->state.event);
+	sched->state.unassigned	= num - hweight_long(mevt);
 }
 
 static void perf_sched_save_state(struct perf_sched *sched)
@@ -896,6 +899,7 @@ static bool perf_sched_restore_state(struct perf_sched *sched)
 static bool __perf_sched_find_counter(struct perf_sched *sched)
 {
 	struct event_constraint *c;
+	struct perf_event *e;
 	int idx;
 
 	if (!sched->state.unassigned)
@@ -905,16 +909,17 @@ static bool __perf_sched_find_counter(struct perf_sched *sched)
 		return false;
 
 	c = sched->constraints[sched->state.event];
+	e = sched->events[sched->state.event];
 	/* Prefer fixed purpose counters */
 	if (c->idxmsk64 & (~0ULL << INTEL_PMC_IDX_FIXED)) {
 		idx = INTEL_PMC_IDX_FIXED;
 		for_each_set_bit_from(idx, c->idxmsk, X86_PMC_IDX_MAX) {
 			u64 mask = BIT_ULL(idx);
 
-			if (sched->msk_counters & mask)
+			if ((sched->msk_counters & mask) && is_sampling_event(e))
 				continue;
 
-			if (sched->state.used & mask)
+			if ((sched->state.used & mask) && is_sampling_event(e))
 				continue;
 
 			sched->state.used |= mask;
@@ -1016,14 +1021,15 @@ static void perf_sched_obtain_used_registers(int *assign, int n, u64 *events, u6
 	}
 }
 
-static int __perf_assign_events(struct event_constraint **constraints, int n,
+static int __perf_assign_events(struct perf_event **events,
+		struct event_constraint **constraints, int n,
 			int wmin, int wmax, int gpmax, int *assign)
 {
-	u64 msk_events, msk_counters;
+	u64 mevt, mcnt;
 	struct perf_sched sched;
 
-	perf_sched_obtain_used_registers(assign, n, &msk_events, &msk_counters);
-	perf_sched_init(&sched, constraints, n, wmin, wmax, gpmax, msk_events, msk_counters);
+	perf_sched_obtain_used_registers(assign, n, &mevt, &mcnt);
+	perf_sched_init(&sched, events, constraints, n, wmin, wmax, gpmax, mevt, mcnt);
 
 	do {
 		if (!perf_sched_find_counter(&sched))
@@ -1035,6 +1041,13 @@ static int __perf_assign_events(struct event_constraint **constraints, int n,
 	return sched.state.unassigned;
 }
 
+static bool is_pmc_reuseable(struct perf_event *e,
+		struct event_constraint *c)
+{
+	return c->idxmsk64 & (~0ULL << INTEL_PMC_IDX_FIXED) &&
+		!is_sampling_event(e);
+}
+
 /*
  * Assign a counter for each event.
  */
@@ -1043,12 +1056,13 @@ int perf_assign_events(struct perf_event **event_list,
 		int wmin, int wmax, int gpmax, int *assign)
 {
 	struct event_constraint *c;
+	struct perf_event *e;
 	struct hw_perf_event *hwc;
 	u64 used_mask = 0;
 	int unsched = 0;
 	int i;
 
-	memset(assign, -1, n);
+	memset(assign, -1, n * sizeof(int));
 
 	/*
 	 * fastpath, try to reuse previous register
@@ -1058,6 +1072,7 @@ int perf_assign_events(struct perf_event **event_list,
 
 		hwc = &event_list[i]->hw;
 		c = constraints[i];
+		e = event_list[i];
 
 		/* never assigned */
 		if (hwc->idx == -1)
@@ -1072,8 +1087,10 @@ int perf_assign_events(struct perf_event **event_list,
 		mask |= mask << 1;
 
 		/* not already used */
-		if (used_mask & mask)
-			break;
+		if (used_mask & mask) {
+			if (!is_pmc_reuseable(e, c))
+				break;
+		}
 
 		used_mask |= mask;
 
@@ -1083,12 +1100,12 @@ int perf_assign_events(struct perf_event **event_list,
 
 	/* slow path */
 	if (i != n) {
-		unsched = __perf_assign_events(constraints, n,
+		unsched = __perf_assign_events(event_list, constraints, n,
 				wmin, wmax, gpmax, assign);
 
 		if (unsched) {
-			memset(assign, -1, n);
-			unsched = __perf_assign_events(constraints, n,
+			memset(assign, -1, n * sizeof(int));
+			unsched = __perf_assign_events(event_list, constraints, n,
 				wmin, wmax, gpmax, assign);
 		}
 	}
-- 
1.8.3.1
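
[Editorial note, not part of the patch: the following is a minimal, hedged
user-space sketch of the kind of event the changelog describes. It opens a
counting-only "cycles" event via perf_event_open(2); because sample_period
and sample_freq stay at 0, is_sampling_event() is false for this event, so
it is the class of event the scheduler change above allows to share a fixed
counter instead of occupying a generic one. File name and structure are
illustrative only.]

/* cycles_count.c: open a non-sampling hardware cycles counter. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
			    int cpu, int group_fd, unsigned long flags)
{
	return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	long long count;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_HARDWARE;
	attr.size = sizeof(attr);
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.disabled = 1;
	/* sample_period/sample_freq left at 0: this is a pure counting event */

	fd = perf_event_open(&attr, 0, -1, -1, 0);	/* this task, any CPU */
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}

	ioctl(fd, PERF_EVENT_IOC_RESET, 0);
	ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
	/* ... workload under measurement would run here ... */
	ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

	if (read(fd, &count, sizeof(count)) == sizeof(count))
		printf("cycles: %lld\n", count);

	close(fd);
	return 0;
}

[Compile with gcc and run with the NMI watchdog enabled
(kernel.nmi_watchdog=1) to exercise the scenario described in the
changelog, where such a counting event would otherwise fall back to a
generic counter.]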