From: Marco Elver <elver@google.com>
Subject: [PATCH 3/8] perf/hw_breakpoint: Optimize constant number of breakpoint slots
Date: Thu, 9 Jun 2022 13:30:41 +0200
Message-Id: <20220609113046.780504-4-elver@google.com>
In-Reply-To: <20220609113046.780504-1-elver@google.com>
References: <20220609113046.780504-1-elver@google.com>
To: elver@google.com, Peter Zijlstra, Frederic Weisbecker, Ingo Molnar
Cc: Thomas Gleixner, Arnaldo Carvalho de Melo, Mark Rutland,
    Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dmitry Vyukov,
    linux-perf-users@vger.kernel.org, x86@kernel.org, linux-sh@vger.kernel.org,
    kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org

Optimize internal hw_breakpoint state if the architecture's number of
breakpoint slots is constant. This avoids several kmalloc() calls and
potentially unnecessary failures if the allocations fail, as well as
subtly improves code generation and cache locality.

The protocol is that if an architecture defines hw_breakpoint_slots via
the preprocessor, it must be constant and the same for all types.

Signed-off-by: Marco Elver <elver@google.com>
---
 arch/sh/include/asm/hw_breakpoint.h  |  5 +-
 arch/x86/include/asm/hw_breakpoint.h |  5 +-
 kernel/events/hw_breakpoint.c        | 92 ++++++++++++++++++----------
 3 files changed, 62 insertions(+), 40 deletions(-)

diff --git a/arch/sh/include/asm/hw_breakpoint.h b/arch/sh/include/asm/hw_breakpoint.h
index 199d17b765f2..361a0f57bdeb 100644
--- a/arch/sh/include/asm/hw_breakpoint.h
+++ b/arch/sh/include/asm/hw_breakpoint.h
@@ -48,10 +48,7 @@ struct pmu;
 /* Maximum number of UBC channels */
 #define HBP_NUM 2
 
-static inline int hw_breakpoint_slots(int type)
-{
-	return HBP_NUM;
-}
+#define hw_breakpoint_slots(type) (HBP_NUM)
 
 /* arch/sh/kernel/hw_breakpoint.c */
 extern int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw);
diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
index a1f0e90d0818..0bc931cd0698 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -44,10 +44,7 @@ struct arch_hw_breakpoint {
 /* Total number of available HW breakpoint registers */
 #define HBP_NUM 4
 
-static inline int hw_breakpoint_slots(int type)
-{
-	return HBP_NUM;
-}
+#define hw_breakpoint_slots(type) (HBP_NUM)
 
 struct perf_event_attr;
 struct perf_event;
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index 1f718745d569..8e939723f27d 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -41,13 +41,16 @@ struct bp_cpuinfo {
 	/* Number of pinned cpu breakpoints in a cpu */
 	unsigned int	cpu_pinned;
 	/* tsk_pinned[n] is the number of tasks having n+1 breakpoints */
+#ifdef hw_breakpoint_slots
+	unsigned int	tsk_pinned[hw_breakpoint_slots(0)];
+#else
 	unsigned int	*tsk_pinned;
+#endif
 	/* Number of non-pinned cpu/task breakpoints in a cpu */
 	unsigned int	flexible; /* XXX: placeholder, see fetch_this_slot() */
 };
 
 static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]);
-static int nr_slots[TYPE_MAX] __ro_after_init;
 
 static struct bp_cpuinfo *get_bp_info(int cpu, enum bp_type_idx type)
 {
@@ -74,6 +77,54 @@ struct bp_busy_slots {
 /* Serialize accesses to the above constraints */
 static DEFINE_MUTEX(nr_bp_mutex);
 
+#ifdef hw_breakpoint_slots
+/*
+ * Number of breakpoint slots is constant, and the same for all types.
+ */
+static_assert(hw_breakpoint_slots(TYPE_INST) == hw_breakpoint_slots(TYPE_DATA));
+static inline int hw_breakpoint_slots_cached(int type)	{ return hw_breakpoint_slots(type); }
+static inline int init_breakpoint_slots(void)		{ return 0; }
+#else
+/*
+ * Dynamic number of breakpoint slots.
+ */
+static int __nr_bp_slots[TYPE_MAX] __ro_after_init;
+
+static inline int hw_breakpoint_slots_cached(int type)
+{
+	return __nr_bp_slots[type];
+}
+
+static __init int init_breakpoint_slots(void)
+{
+	int i, cpu, err_cpu;
+
+	for (i = 0; i < TYPE_MAX; i++)
+		__nr_bp_slots[i] = hw_breakpoint_slots(i);
+
+	for_each_possible_cpu(cpu) {
+		for (i = 0; i < TYPE_MAX; i++) {
+			struct bp_cpuinfo *info = get_bp_info(cpu, i);
+
+			info->tsk_pinned = kcalloc(__nr_bp_slots[i], sizeof(int), GFP_KERNEL);
+			if (!info->tsk_pinned)
+				goto err;
+		}
+	}
+
+	return 0;
+err:
+	for_each_possible_cpu(err_cpu) {
+		for (i = 0; i < TYPE_MAX; i++)
+			kfree(get_bp_info(err_cpu, i)->tsk_pinned);
+		if (err_cpu == cpu)
+			break;
+	}
+
+	return -ENOMEM;
+}
+#endif
+
 __weak int hw_breakpoint_weight(struct perf_event *bp)
 {
 	return 1;
@@ -96,7 +147,7 @@ static unsigned int max_task_bp_pinned(int cpu, enum bp_type_idx type)
 	unsigned int *tsk_pinned = get_bp_info(cpu, type)->tsk_pinned;
 	int i;
 
-	for (i = nr_slots[type] - 1; i >= 0; i--) {
+	for (i = hw_breakpoint_slots_cached(type) - 1; i >= 0; i--) {
 		if (tsk_pinned[i] > 0)
 			return i + 1;
 	}
@@ -313,7 +364,7 @@ static int __reserve_bp_slot(struct perf_event *bp, u64 bp_type)
 	fetch_this_slot(&slots, weight);
 
 	/* Flexible counters need to keep at least one slot */
-	if (slots.pinned + (!!slots.flexible) > nr_slots[type])
+	if (slots.pinned + (!!slots.flexible) > hw_breakpoint_slots_cached(type))
 		return -ENOSPC;
 
 	ret = arch_reserve_bp_slot(bp);
@@ -688,42 +739,19 @@ static struct pmu perf_breakpoint = {
 
 int __init init_hw_breakpoint(void)
 {
-	int cpu, err_cpu;
-	int i, ret;
-
-	for (i = 0; i < TYPE_MAX; i++)
-		nr_slots[i] = hw_breakpoint_slots(i);
-
-	for_each_possible_cpu(cpu) {
-		for (i = 0; i < TYPE_MAX; i++) {
-			struct bp_cpuinfo *info = get_bp_info(cpu, i);
-
-			info->tsk_pinned = kcalloc(nr_slots[i], sizeof(int),
-						   GFP_KERNEL);
-			if (!info->tsk_pinned) {
-				ret = -ENOMEM;
-				goto err;
-			}
-		}
-	}
+	int ret;
 
 	ret = rhltable_init(&task_bps_ht, &task_bps_ht_params);
 	if (ret)
-		goto err;
+		return ret;
+
+	ret = init_breakpoint_slots();
+	if (ret)
+		return ret;
 
 	constraints_initialized = true;
 
 	perf_pmu_register(&perf_breakpoint, "breakpoint", PERF_TYPE_BREAKPOINT);
 
 	return register_die_notifier(&hw_breakpoint_exceptions_nb);
-
-err:
-	for_each_possible_cpu(err_cpu) {
-		for (i = 0; i < TYPE_MAX; i++)
-			kfree(get_bp_info(err_cpu, i)->tsk_pinned);
-		if (err_cpu == cpu)
-			break;
-	}
-
-	return ret;
 }
-- 
2.36.1.255.ge46751e96f-goog
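
For illustration only, a minimal standalone C sketch of the protocol the
commit message describes: when hw_breakpoint_slots is a preprocessor define,
the slot count is a compile-time constant and the per-type bookkeeping can be
sized statically; otherwise the count is queried once at init and storage is
allocated dynamically. Names not taken from the patch (arch_query_slots, the
fixed count of 4, main) are placeholders for this example, not kernel API.

/* Toy userspace model of the constant-vs-dynamic slot protocol. */
#include <stdio.h>
#include <stdlib.h>

/* "Architecture" with a constant slot count; comment out to try the dynamic path. */
#define hw_breakpoint_slots(type) (4)

#ifdef hw_breakpoint_slots
/* Constant case: fixed-size array, nothing to allocate at init. */
static unsigned int tsk_pinned[hw_breakpoint_slots(0)];

static inline int hw_breakpoint_slots_cached(int type) { return hw_breakpoint_slots(type); }
static inline int init_breakpoint_slots(void)          { return 0; }
#else
/* Dynamic case: ask the "architecture" once, cache the count, allocate storage. */
static int arch_query_slots(int type) { (void)type; return 4; }	/* stand-in */

static unsigned int *tsk_pinned;
static int __nr_bp_slots;

static inline int hw_breakpoint_slots_cached(int type) { (void)type; return __nr_bp_slots; }

static int init_breakpoint_slots(void)
{
	__nr_bp_slots = arch_query_slots(0);
	tsk_pinned = calloc(__nr_bp_slots, sizeof(*tsk_pinned));
	return tsk_pinned ? 0 : -1;
}
#endif

int main(void)
{
	if (init_breakpoint_slots())
		return 1;
	tsk_pinned[0]++;	/* the storage is usable either way */
	printf("breakpoint slots: %d, tsk_pinned[0] = %u\n",
	       hw_breakpoint_slots_cached(0), tsk_pinned[0]);
	return 0;
}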