Date: Mon, 4 Jul 2022 17:05:06 +0200
In-Reply-To: <20220704150514.48816-1-elver@google.com>
Message-Id: <20220704150514.48816-7-elver@google.com>
References: <20220704150514.48816-1-elver@google.com>
Subject: [PATCH v3 06/14] perf/hw_breakpoint: Optimize constant number of breakpoint slots
From: Marco Elver
To: elver@google.com, Peter Zijlstra, Frederic Weisbecker, Ingo Molnar
Cc: Thomas Gleixner, Arnaldo Carvalho de Melo, Mark Rutland,
 Alexander Shishkin, Jiri Olsa, Namhyung Kim, Dmitry Vyukov,
 Michael Ellerman, linuxppc-dev@lists.ozlabs.org,
 linux-perf-users@vger.kernel.org, x86@kernel.org,
 linux-sh@vger.kernel.org, kasan-dev@googlegroups.com,
 linux-kernel@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Optimize internal hw_breakpoint state if the architecture's number of
breakpoint slots is constant. This avoids several kmalloc() calls and
potentially unnecessary failures if the allocations fail, as well as
subtly improves code generation and cache locality.

The protocol is that if an architecture defines hw_breakpoint_slots via
the preprocessor, it must be constant and the same for all types.

Signed-off-by: Marco Elver
Acked-by: Dmitry Vyukov
---
 arch/sh/include/asm/hw_breakpoint.h  |  5 +-
 arch/x86/include/asm/hw_breakpoint.h |  5 +-
 kernel/events/hw_breakpoint.c        | 94 ++++++++++++++++++----------
 3 files changed, 63 insertions(+), 41 deletions(-)

diff --git a/arch/sh/include/asm/hw_breakpoint.h b/arch/sh/include/asm/hw_breakpoint.h
index 199d17b765f2..361a0f57bdeb 100644
--- a/arch/sh/include/asm/hw_breakpoint.h
+++ b/arch/sh/include/asm/hw_breakpoint.h
@@ -48,10 +48,7 @@ struct pmu;
 /* Maximum number of UBC channels */
 #define HBP_NUM 2
 
-static inline int hw_breakpoint_slots(int type)
-{
-	return HBP_NUM;
-}
+#define hw_breakpoint_slots(type) (HBP_NUM)
 
 /* arch/sh/kernel/hw_breakpoint.c */
 extern int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw);
diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h
index a1f0e90d0818..0bc931cd0698 100644
--- a/arch/x86/include/asm/hw_breakpoint.h
+++ b/arch/x86/include/asm/hw_breakpoint.h
@@ -44,10 +44,7 @@ struct arch_hw_breakpoint {
 /* Total number of available HW breakpoint registers */
 #define HBP_NUM 4
 
-static inline int hw_breakpoint_slots(int type)
-{
-	return HBP_NUM;
-}
+#define hw_breakpoint_slots(type) (HBP_NUM)
 
 struct perf_event_attr;
 struct perf_event;
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index 7df46b276452..9fb66d358d81 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -40,13 +40,16 @@ struct bp_cpuinfo {
 	/* Number of pinned cpu breakpoints in a cpu */
 	unsigned int cpu_pinned;
 	/* tsk_pinned[n] is the number of tasks having n+1 breakpoints */
+#ifdef hw_breakpoint_slots
+	unsigned int tsk_pinned[hw_breakpoint_slots(0)];
+#else
 	unsigned int *tsk_pinned;
+#endif
 	/* Number of non-pinned cpu/task breakpoints in a cpu */
 	unsigned int flexible; /* XXX: placeholder, see fetch_this_slot() */
 };
 
 static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]);
-static int nr_slots[TYPE_MAX] __ro_after_init;
 
 static struct bp_cpuinfo *get_bp_info(int cpu, enum bp_type_idx type)
 {
@@ -73,6 +76,54 @@ struct bp_busy_slots {
 /* Serialize accesses to the above constraints */
 static DEFINE_MUTEX(nr_bp_mutex);
 
+#ifdef hw_breakpoint_slots
+/*
+ * Number of breakpoint slots is constant, and the same for all types.
+ */
+static_assert(hw_breakpoint_slots(TYPE_INST) == hw_breakpoint_slots(TYPE_DATA));
+static inline int hw_breakpoint_slots_cached(int type) { return hw_breakpoint_slots(type); }
+static inline int init_breakpoint_slots(void) { return 0; }
+#else
+/*
+ * Dynamic number of breakpoint slots.
+ */
+static int __nr_bp_slots[TYPE_MAX] __ro_after_init;
+
+static inline int hw_breakpoint_slots_cached(int type)
+{
+	return __nr_bp_slots[type];
+}
+
+static __init int init_breakpoint_slots(void)
+{
+	int i, cpu, err_cpu;
+
+	for (i = 0; i < TYPE_MAX; i++)
+		__nr_bp_slots[i] = hw_breakpoint_slots(i);
+
+	for_each_possible_cpu(cpu) {
+		for (i = 0; i < TYPE_MAX; i++) {
+			struct bp_cpuinfo *info = get_bp_info(cpu, i);
+
+			info->tsk_pinned = kcalloc(__nr_bp_slots[i], sizeof(int), GFP_KERNEL);
+			if (!info->tsk_pinned)
+				goto err;
+		}
+	}
+
+	return 0;
+err:
+	for_each_possible_cpu(err_cpu) {
+		for (i = 0; i < TYPE_MAX; i++)
+			kfree(get_bp_info(err_cpu, i)->tsk_pinned);
+		if (err_cpu == cpu)
+			break;
+	}
+
+	return -ENOMEM;
+}
+#endif
+
 __weak int hw_breakpoint_weight(struct perf_event *bp)
 {
 	return 1;
@@ -95,7 +146,7 @@ static unsigned int max_task_bp_pinned(int cpu, enum bp_type_idx type)
 	unsigned int *tsk_pinned = get_bp_info(cpu, type)->tsk_pinned;
 	int i;
 
-	for (i = nr_slots[type] - 1; i >= 0; i--) {
+	for (i = hw_breakpoint_slots_cached(type) - 1; i >= 0; i--) {
 		if (tsk_pinned[i] > 0)
 			return i + 1;
 	}
@@ -312,7 +363,7 @@ static int __reserve_bp_slot(struct perf_event *bp, u64 bp_type)
 	fetch_this_slot(&slots, weight);
 
 	/* Flexible counters need to keep at least one slot */
-	if (slots.pinned + (!!slots.flexible) > nr_slots[type])
+	if (slots.pinned + (!!slots.flexible) > hw_breakpoint_slots_cached(type))
 		return -ENOSPC;
 
 	ret = arch_reserve_bp_slot(bp);
@@ -632,7 +683,7 @@ bool hw_breakpoint_is_used(void)
 			if (info->cpu_pinned)
 				return true;
 
-			for (int slot = 0; slot < nr_slots[type]; ++slot) {
+			for (int slot = 0; slot < hw_breakpoint_slots_cached(type); ++slot) {
 				if (info->tsk_pinned[slot])
 					return true;
 			}
@@ -716,42 +767,19 @@ static struct pmu perf_breakpoint = {
 
 int __init init_hw_breakpoint(void)
 {
-	int cpu, err_cpu;
-	int i, ret;
-
-	for (i = 0; i < TYPE_MAX; i++)
-		nr_slots[i] = hw_breakpoint_slots(i);
-
-	for_each_possible_cpu(cpu) {
-		for (i = 0; i < TYPE_MAX; i++) {
-			struct bp_cpuinfo *info = get_bp_info(cpu, i);
-
-			info->tsk_pinned = kcalloc(nr_slots[i], sizeof(int),
-						   GFP_KERNEL);
-			if (!info->tsk_pinned) {
-				ret = -ENOMEM;
-				goto err;
-			}
-		}
-	}
+	int ret;
 
 	ret = rhltable_init(&task_bps_ht, &task_bps_ht_params);
 	if (ret)
-		goto err;
+		return ret;
+
+	ret = init_breakpoint_slots();
+	if (ret)
+		return ret;
 
 	constraints_initialized = true;
 
 	perf_pmu_register(&perf_breakpoint, "breakpoint", PERF_TYPE_BREAKPOINT);
 
 	return register_die_notifier(&hw_breakpoint_exceptions_nb);
-
-err:
-	for_each_possible_cpu(err_cpu) {
-		for (i = 0; i < TYPE_MAX; i++)
-			kfree(get_bp_info(err_cpu, i)->tsk_pinned);
-		if (err_cpu == cpu)
-			break;
-	}
-
-	return ret;
 }
-- 
2.37.0.rc0.161.g10f37bed90-goog
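
For readers who want to try the protocol from the commit message outside the kernel, the following is a minimal userspace sketch, not the kernel code. It assumes an "architecture" that opts in to the constant case by defining hw_breakpoint_slots as a macro. The names slot_info, slots_cached, init_slots, free_slots and max_pinned are hypothetical stand-ins, calloc() stands in for kcalloc(), and the slot count of 4 is arbitrary.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's breakpoint type indices. */
enum { TYPE_INST, TYPE_DATA, TYPE_MAX };

/*
 * Assumption for this sketch: the "architecture" opts in to the constant
 * case by defining the slot count as a macro. Remove this line to build
 * the dynamic branch instead.
 */
#define hw_breakpoint_slots(type) (4)

#ifdef hw_breakpoint_slots
/* Constant case: the count must not depend on the type. */
static_assert(hw_breakpoint_slots(TYPE_INST) == hw_breakpoint_slots(TYPE_DATA),
	      "slot count must be the same for all types");

struct slot_info {
	unsigned int tsk_pinned[hw_breakpoint_slots(0)]; /* embedded, no allocation */
};

static inline int slots_cached(int type) { return hw_breakpoint_slots(type); }
static inline int init_slots(struct slot_info *info) { (void)info; return 0; }
static inline void free_slots(struct slot_info *info) { (void)info; }
#else
/* Dynamic case: hypothetical runtime query standing in for the arch hook. */
static int hw_breakpoint_slots(int type) { (void)type; return 4; }

struct slot_info {
	unsigned int *tsk_pinned; /* heap-allocated, so initialization can fail */
};

static inline int slots_cached(int type) { return hw_breakpoint_slots(type); }

static inline int init_slots(struct slot_info *info)
{
	info->tsk_pinned = calloc((size_t)slots_cached(0), sizeof(*info->tsk_pinned));
	return info->tsk_pinned ? 0 : -1;
}

static inline void free_slots(struct slot_info *info) { free(info->tsk_pinned); }
#endif

/* The same caller code works for both layouts: highest occupied index + 1. */
static unsigned int max_pinned(const struct slot_info *info, int type)
{
	for (int i = slots_cached(type) - 1; i >= 0; i--) {
		if (info->tsk_pinned[i] > 0)
			return (unsigned int)i + 1;
	}
	return 0;
}
```

A caller would zero-initialize a struct slot_info, call init_slots() once, and then use max_pinned() without knowing which layout was compiled in. In the constant branch init_slots() cannot fail, which mirrors the motivation given in the commit message.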