Date: Tue, 28 Jun 2022 11:58:25 +0200
In-Reply-To: <20220628095833.2579903-1-elver@google.com>
Message-Id: <20220628095833.2579903-6-elver@google.com>
References: <20220628095833.2579903-1-elver@google.com>
Subject: [PATCH v2 05/13] perf/hw_breakpoint: Optimize constant number of breakpoint slots
From: Marco Elver <elver@google.com>
To: elver@google.com, Peter Zijlstra, Frederic Weisbecker, Ingo Molnar
Cc: Thomas Gleixner, Arnaldo Carvalho de Melo, Mark Rutland,
    Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dmitry Vyukov,
    Michael Ellerman, linuxppc-dev@lists.ozlabs.org,
    linux-perf-users@vger.kernel.org, x86@kernel.org,
    linux-sh@vger.kernel.org, kasan-dev@googlegroups.com,
    linux-kernel@vger.kernel.org

Optimize internal hw_breakpoint state if the architecture's number of
breakpoint slots is constant. This avoids several kmalloc() calls,
removes a potential source of initialization failure when those
allocations fail, and subtly improves code generation and cache
locality.

The protocol is that if an architecture defines hw_breakpoint_slots
via the preprocessor, it must be constant and the same for all types.

Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
---
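Note: as a rough illustration of the pattern above, a minimal
userspace sketch (not kernel code). All names in it (demo_slots,
struct demo_cpuinfo, init_demo_slots, ...) are invented for this
example; only the shape mirrors kernel/events/hw_breakpoint.c. When
the slot count is a preprocessor constant, the accounting struct can
embed a fixed-size array and the init step reduces to a no-op;
otherwise the array has to be allocated at boot and initialization
can fail:

	/*
	 * Userspace sketch of the constant-vs-dynamic slot pattern.
	 * All identifiers are invented for illustration.
	 */
	#include <stdio.h>
	#include <stdlib.h>

	/* Comment this #define out to exercise the dynamic path. */
	#define demo_slots(type) (4)

	#ifdef demo_slots
	/* If defined via the preprocessor, it must be constant and the
	 * same for all "types" -- enforceable at compile time. */
	_Static_assert(demo_slots(0) == demo_slots(1),
		       "slot count must match for all types");
	#endif

	struct demo_cpuinfo {
	#ifdef demo_slots
		/* Constant count: embed the array; no allocation. */
		unsigned int tsk_pinned[demo_slots(0)];
	#else
		/* Dynamic count: must be allocated (and may fail). */
		unsigned int *tsk_pinned;
	#endif
	};

	static struct demo_cpuinfo cpuinfo;

	#ifdef demo_slots
	static int demo_slots_cached(int type) { return demo_slots(type); }
	static int init_demo_slots(void) { return 0; } /* no-op */
	#else
	static int nr_demo_slots; /* filled in once at init */

	static int demo_slots_cached(int type)
	{
		(void)type;
		return nr_demo_slots;
	}

	static int init_demo_slots(void)
	{
		nr_demo_slots = 4; /* would be queried from the arch */
		cpuinfo.tsk_pinned = calloc(nr_demo_slots,
					    sizeof(unsigned int));
		return cpuinfo.tsk_pinned ? 0 : -1;
	}
	#endif

	int main(void)
	{
		if (init_demo_slots() != 0)
			return 1;
		cpuinfo.tsk_pinned[0] = 1; /* uniform access either way */
		printf("slots = %d, tsk_pinned[0] = %u\n",
		       demo_slots_cached(0), cpuinfo.tsk_pinned[0]);
		return 0;
	}

The #ifdef detection works because function-like macros are visible
to #ifdef, which is why the arch headers below switch from an inline
function to a macro of the same name.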
 arch/sh/include/asm/hw_breakpoint.h  |  5 +-
 arch/x86/include/asm/hw_breakpoint.h |  5 +-
 kernel/events/hw_breakpoint.c        | 92 ++++++++++++++++++----------
 3 files changed, 62 insertions(+), 40 deletions(-)

diff --git a/arch/sh/include/asm/hw_breakpoint.h b/arch/sh/include/asm/hw_breakpoint.h
index 199d17b765f2..361a0f57bdeb 100644
--- a/arch/sh/include/asm/hw_breakpoint.h
+++ b/arch/sh/include/asm/hw_breakpoint.h
@@ -48,10 +48,7 @@ struct pmu;
 /* Maximum number of UBC channels */
 #define HBP_NUM		2
 
-static inline int hw_breakpoint_slots(int type)
-{
-	return HBP_NUM;
-}
+#define hw_breakpoint_slots(type) (HBP_NUM)
 
 /* arch/sh/kernel/hw_breakpoint.c */
 extern int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw);
diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
index a1f0e90d0818..0bc931cd0698 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -44,10 +44,7 @@ struct arch_hw_breakpoint {
 /* Total number of available HW breakpoint registers */
 #define HBP_NUM 4
 
-static inline int hw_breakpoint_slots(int type)
-{
-	return HBP_NUM;
-}
+#define hw_breakpoint_slots(type) (HBP_NUM)
 
 struct perf_event_attr;
 struct perf_event;
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index 270be965f829..a089302ddf59 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -40,13 +40,16 @@ struct bp_cpuinfo {
 	/* Number of pinned cpu breakpoints in a cpu */
 	unsigned int	cpu_pinned;
 	/* tsk_pinned[n] is the number of tasks having n+1 breakpoints */
+#ifdef hw_breakpoint_slots
+	unsigned int	tsk_pinned[hw_breakpoint_slots(0)];
+#else
 	unsigned int	*tsk_pinned;
+#endif
 	/* Number of non-pinned cpu/task breakpoints in a cpu */
 	unsigned int	flexible; /* XXX: placeholder, see fetch_this_slot() */
 };
 
 static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]);
-static int nr_slots[TYPE_MAX] __ro_after_init;
 
 static struct bp_cpuinfo *get_bp_info(int cpu, enum bp_type_idx type)
 {
@@ -73,6 +76,54 @@ struct bp_busy_slots {
 /* Serialize accesses to the above constraints */
 static DEFINE_MUTEX(nr_bp_mutex);
 
+#ifdef hw_breakpoint_slots
+/*
+ * Number of breakpoint slots is constant, and the same for all types.
+ */
+static_assert(hw_breakpoint_slots(TYPE_INST) == hw_breakpoint_slots(TYPE_DATA));
+static inline int hw_breakpoint_slots_cached(int type)	{ return hw_breakpoint_slots(type); }
+static inline int init_breakpoint_slots(void)		{ return 0; }
+#else
+/*
+ * Dynamic number of breakpoint slots.
+ */
+static int __nr_bp_slots[TYPE_MAX] __ro_after_init;
+
+static inline int hw_breakpoint_slots_cached(int type)
+{
+	return __nr_bp_slots[type];
+}
+
+static __init int init_breakpoint_slots(void)
+{
+	int i, cpu, err_cpu;
+
+	for (i = 0; i < TYPE_MAX; i++)
+		__nr_bp_slots[i] = hw_breakpoint_slots(i);
+
+	for_each_possible_cpu(cpu) {
+		for (i = 0; i < TYPE_MAX; i++) {
+			struct bp_cpuinfo *info = get_bp_info(cpu, i);
+
+			info->tsk_pinned = kcalloc(__nr_bp_slots[i], sizeof(int), GFP_KERNEL);
+			if (!info->tsk_pinned)
+				goto err;
+		}
+	}
+
+	return 0;
+err:
+	for_each_possible_cpu(err_cpu) {
+		for (i = 0; i < TYPE_MAX; i++)
+			kfree(get_bp_info(err_cpu, i)->tsk_pinned);
+		if (err_cpu == cpu)
+			break;
+	}
+
+	return -ENOMEM;
+}
+#endif
+
 __weak int hw_breakpoint_weight(struct perf_event *bp)
 {
 	return 1;
@@ -95,7 +146,7 @@ static unsigned int max_task_bp_pinned(int cpu, enum bp_type_idx type)
 	unsigned int *tsk_pinned = get_bp_info(cpu, type)->tsk_pinned;
 	int i;
 
-	for (i = nr_slots[type] - 1; i >= 0; i--) {
+	for (i = hw_breakpoint_slots_cached(type) - 1; i >= 0; i--) {
 		if (tsk_pinned[i] > 0)
 			return i + 1;
 	}
@@ -312,7 +363,7 @@ static int __reserve_bp_slot(struct perf_event *bp, u64 bp_type)
 	fetch_this_slot(&slots, weight);
 
 	/* Flexible counters need to keep at least one slot */
-	if (slots.pinned + (!!slots.flexible) > nr_slots[type])
+	if (slots.pinned + (!!slots.flexible) > hw_breakpoint_slots_cached(type))
 		return -ENOSPC;
 
 	ret = arch_reserve_bp_slot(bp);
@@ -687,42 +738,19 @@ static struct pmu perf_breakpoint = {
 
 int __init init_hw_breakpoint(void)
 {
-	int cpu, err_cpu;
-	int i, ret;
-
-	for (i = 0; i < TYPE_MAX; i++)
-		nr_slots[i] = hw_breakpoint_slots(i);
-
-	for_each_possible_cpu(cpu) {
-		for (i = 0; i < TYPE_MAX; i++) {
-			struct bp_cpuinfo *info = get_bp_info(cpu, i);
-
-			info->tsk_pinned = kcalloc(nr_slots[i], sizeof(int),
-						   GFP_KERNEL);
-			if (!info->tsk_pinned) {
-				ret = -ENOMEM;
-				goto err;
-			}
-		}
-	}
+	int ret;
 
 	ret = rhltable_init(&task_bps_ht, &task_bps_ht_params);
 	if (ret)
-		goto err;
+		return ret;
+
+	ret = init_breakpoint_slots();
+	if (ret)
+		return ret;
 
 	constraints_initialized = true;
 
 	perf_pmu_register(&perf_breakpoint, "breakpoint", PERF_TYPE_BREAKPOINT);
 
 	return register_die_notifier(&hw_breakpoint_exceptions_nb);
-
-err:
-	for_each_possible_cpu(err_cpu) {
-		for (i = 0; i < TYPE_MAX; i++)
-			kfree(get_bp_info(err_cpu, i)->tsk_pinned);
-		if (err_cpu == cpu)
-			break;
-	}
-
-	return ret;
 }
-- 
2.37.0.rc0.161.g10f37bed90-goog