Received: by 2002:a5d:925a:0:0:0:0:0 with SMTP id e26csp381865iol; Thu, 9 Jun 2022 05:53:14 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxEVncHMrAs+YdTrTYikovxdViiQSfK+UTdYJ9L0OVGicp/l6gYAvh5bi1CBVGAQJG+7v8T X-Received: by 2002:a17:902:b184:b0:168:a310:3eb0 with SMTP id s4-20020a170902b18400b00168a3103eb0mr5202486plr.9.1654779194131; Thu, 09 Jun 2022 05:53:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1654779194; cv=none; d=google.com; s=arc-20160816; b=NbKZqhtvH36/OYd52Jzp3oxhc3Gv1Mb1fhe5JTkTj5DsTbLccznFsgvq7pDWJJaugs 1Xh4P7kiDv82RQvsQtYRFAtq2X/RLjhfT4Ch1C0MXVrkk0AhK7WT7CyUgcLAweFhbmUQ KrqMYSWAq5mz5YB12wV62V+aqe3uy7hvTQ6KGhq9BLLfosA/3WfMpC/7bLdexv/Oy79D O4GcTjKeFf2jpQ0WRh0XStl4VPXudkU5PcAdkK5D/uJT05EcNIlad+tP2mgX8KOq5TZ3 fsQ3GWw8Izq4CnyeEwAqodxklzJ5oljVgH/fgVnjLJAJUI/S1iVqu5D1IYXQdgtv/va1 iotQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=WY4HD3Pbrvr9rNcNT3QthTekyAgPVjpj/oQPb25t2us=; b=EXIZGcHguDv7iDCpiYnunH54z4kJT9JrV1sWiUaVTjwLABHDE2RDTY5ymKCJRIatln 41J9IEUlssnPKS3Frs7rXJf8eakYudU25C1j31iYBd8ZZ8krAn8/6Ufxd70yLUOrR6Q0 6sIurbDHFyFxJvLrWVXitKxkAKZLkpgmvyEV2qi+D7lwfj71yZsJ6nltklagvKpFVPRn qs4f2xfEq7qaDX8SmOdR1L/FMrZwqmgEn4ggqK9wuqZwHlaKCdNu7W7ysuGBGSD9Xdqo KaCoi1Wpkt4D3hzfrJ7Qx1gL6IoGszD0Vc3dCoJAwPwlLJTUQo77K6J1WTmj7SF8nrHN OFOQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=F2mY2Kyj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z6-20020a17090abd8600b001e84ad1f823si24543972pjr.107.2022.06.09.05.52.59; Thu, 09 Jun 2022 05:53:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=F2mY2Kyj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243826AbiFIL4O (ORCPT + 99 others); Thu, 9 Jun 2022 07:56:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59132 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243004AbiFIL4N (ORCPT ); Thu, 9 Jun 2022 07:56:13 -0400 Received: from mail-lf1-x12a.google.com (mail-lf1-x12a.google.com [IPv6:2a00:1450:4864:20::12a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 51228BE7 for ; Thu, 9 Jun 2022 04:56:11 -0700 (PDT) Received: by mail-lf1-x12a.google.com with SMTP id c2so18180384lfk.0 for ; Thu, 09 Jun 2022 04:56:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=WY4HD3Pbrvr9rNcNT3QthTekyAgPVjpj/oQPb25t2us=; b=F2mY2KyjStSCcJDrxpUa4qJ1im4Hog16PjQPUDWpVZi3XYmPt8z6vVKqJ4+2TkvvPH ei1x+gOwSxEeKIj5EW+D3FTE0rHdDFHVcGrqLmERzMrTdfEt1opS+bBiv0VyePPmO+96 jmOgw1WzXFpE3uuuI9S4xrD5Y9Z7h0blixCxNf2DxnLvBdorDllHsSaQ0UVJO8GnMGr8 R3eiS7HHaOpZQm9fxYeYQA6KTzMQjIquR71Upfw+8rAlvPetG9IN8I9G6FVXsmWE2vEY qWKkb0fgXMJsz8fvLCzHSKcUn7p6Eb/9aIrjf6CIB7gOjo/kN81R81VO7i8Uou4MHlKv xMsA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=WY4HD3Pbrvr9rNcNT3QthTekyAgPVjpj/oQPb25t2us=; b=BGwX9anjukvCFAzambgdGWsoXWZcx3JDMhmdnAH5XQb93xHgaW8Lz9Esbk17wvZyFK HWkmAHTRiZybciK4IycFwIlUodbjex1YJxAoxhnTgr0+T1psoEpMwRhJOvzKRUt35LxA iA8pqAAnQ1DFSCJGg0S8fTkZ35/TG9U54OZr/eptPo3VJft5osBj/S2MrsT6nxCm0g2o ID8aO9WbLGxh8S12frdpRp3in2cQ5cgqCaq1JmO8XXV5O1c3illBdY2QCvVTEdgIxdNn mH3MbN2wcYftx4kY0KqNrzXY7lgMLhYd3Hz2MMEHoXCduKu1oxDD03SlmZ/5mBf/MsGu +rqQ== X-Gm-Message-State: AOAM530PfB5UHVjTRBst972vtL5olZJsT0G8lJUjwDd2/A39K/3CmWa0 3Tq2b3NRnZBJb2cRJd7mO8jHHmH/5EOlkmryf+LjIw== X-Received: by 2002:a05:6512:3d0:b0:478:9aca:4a06 with SMTP id w16-20020a05651203d000b004789aca4a06mr24033399lfp.410.1654775769284; Thu, 09 Jun 2022 04:56:09 -0700 (PDT) MIME-Version: 1.0 References: <20220609113046.780504-1-elver@google.com> <20220609113046.780504-4-elver@google.com> In-Reply-To: <20220609113046.780504-4-elver@google.com> From: Dmitry Vyukov Date: Thu, 9 Jun 2022 13:55:57 +0200 Message-ID: Subject: Re: [PATCH 3/8] perf/hw_breakpoint: Optimize constant number of breakpoint slots To: Marco Elver Cc: Peter Zijlstra , Frederic Weisbecker , Ingo Molnar , Thomas Gleixner , Arnaldo Carvalho de Melo , Mark Rutland , Alexander Shishkin , Jiri Olsa , Namhyung Kim , linux-perf-users@vger.kernel.org, x86@kernel.org, linux-sh@vger.kernel.org, kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 9 Jun 2022 at 13:31, Marco Elver wrote: > > Optimize internal hw_breakpoint state if the architecture's number of > breakpoint slots is constant. This avoids several kmalloc() calls and > potentially unnecessary failures if the allocations fail, as well as > subtly improves code generation and cache locality. > > The protocol is that if an architecture defines hw_breakpoint_slots via > the preprocessor, it must be constant and the same for all types. > > Signed-off-by: Marco Elver Acked-by: Dmitry Vyukov > --- > arch/sh/include/asm/hw_breakpoint.h | 5 +- > arch/x86/include/asm/hw_breakpoint.h | 5 +- > kernel/events/hw_breakpoint.c | 92 ++++++++++++++++++---------- > 3 files changed, 62 insertions(+), 40 deletions(-) > > diff --git a/arch/sh/include/asm/hw_breakpoint.h b/arch/sh/include/asm/hw_breakpoint.h > index 199d17b765f2..361a0f57bdeb 100644 > --- a/arch/sh/include/asm/hw_breakpoint.h > +++ b/arch/sh/include/asm/hw_breakpoint.h > @@ -48,10 +48,7 @@ struct pmu; > /* Maximum number of UBC channels */ > #define HBP_NUM 2 > > -static inline int hw_breakpoint_slots(int type) > -{ > - return HBP_NUM; > -} > +#define hw_breakpoint_slots(type) (HBP_NUM) > > /* arch/sh/kernel/hw_breakpoint.c */ > extern int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw); > diff --git a/arch/x86/include/asm/hw_breakpoint.h b/arch/x86/include/asm/hw_breakpoint.h > index a1f0e90d0818..0bc931cd0698 100644 > --- a/arch/x86/include/asm/hw_breakpoint.h > +++ b/arch/x86/include/asm/hw_breakpoint.h > @@ -44,10 +44,7 @@ struct arch_hw_breakpoint { > /* Total number of available HW breakpoint registers */ > #define HBP_NUM 4 > > -static inline int hw_breakpoint_slots(int type) > -{ > - return HBP_NUM; > -} > +#define hw_breakpoint_slots(type) (HBP_NUM) > > struct perf_event_attr; > struct perf_event; > diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c > index 1f718745d569..8e939723f27d 100644 > --- a/kernel/events/hw_breakpoint.c > +++ b/kernel/events/hw_breakpoint.c > @@ -41,13 +41,16 @@ struct bp_cpuinfo { > /* Number of pinned cpu breakpoints in a cpu */ > unsigned int cpu_pinned; > /* tsk_pinned[n] is the number of tasks having n+1 breakpoints */ > +#ifdef hw_breakpoint_slots > + unsigned int tsk_pinned[hw_breakpoint_slots(0)]; > +#else > unsigned int *tsk_pinned; > +#endif > /* Number of non-pinned cpu/task breakpoints in a cpu */ > unsigned int flexible; /* XXX: placeholder, see fetch_this_slot() */ > }; > > static DEFINE_PER_CPU(struct bp_cpuinfo, bp_cpuinfo[TYPE_MAX]); > -static int nr_slots[TYPE_MAX] __ro_after_init; > > static struct bp_cpuinfo *get_bp_info(int cpu, enum bp_type_idx type) > { > @@ -74,6 +77,54 @@ struct bp_busy_slots { > /* Serialize accesses to the above constraints */ > static DEFINE_MUTEX(nr_bp_mutex); > > +#ifdef hw_breakpoint_slots > +/* > + * Number of breakpoint slots is constant, and the same for all types. > + */ > +static_assert(hw_breakpoint_slots(TYPE_INST) == hw_breakpoint_slots(TYPE_DATA)); > +static inline int hw_breakpoint_slots_cached(int type) { return hw_breakpoint_slots(type); } > +static inline int init_breakpoint_slots(void) { return 0; } > +#else > +/* > + * Dynamic number of breakpoint slots. > + */ > +static int __nr_bp_slots[TYPE_MAX] __ro_after_init; > + > +static inline int hw_breakpoint_slots_cached(int type) > +{ > + return __nr_bp_slots[type]; > +} > + > +static __init int init_breakpoint_slots(void) > +{ > + int i, cpu, err_cpu; > + > + for (i = 0; i < TYPE_MAX; i++) > + __nr_bp_slots[i] = hw_breakpoint_slots(i); > + > + for_each_possible_cpu(cpu) { > + for (i = 0; i < TYPE_MAX; i++) { > + struct bp_cpuinfo *info = get_bp_info(cpu, i); > + > + info->tsk_pinned = kcalloc(__nr_bp_slots[i], sizeof(int), GFP_KERNEL); > + if (!info->tsk_pinned) > + goto err; > + } > + } > + > + return 0; > +err: > + for_each_possible_cpu(err_cpu) { > + for (i = 0; i < TYPE_MAX; i++) > + kfree(get_bp_info(err_cpu, i)->tsk_pinned); > + if (err_cpu == cpu) > + break; > + } > + > + return -ENOMEM; > +} > +#endif > + > __weak int hw_breakpoint_weight(struct perf_event *bp) > { > return 1; > @@ -96,7 +147,7 @@ static unsigned int max_task_bp_pinned(int cpu, enum bp_type_idx type) > unsigned int *tsk_pinned = get_bp_info(cpu, type)->tsk_pinned; > int i; > > - for (i = nr_slots[type] - 1; i >= 0; i--) { > + for (i = hw_breakpoint_slots_cached(type) - 1; i >= 0; i--) { > if (tsk_pinned[i] > 0) > return i + 1; > } > @@ -313,7 +364,7 @@ static int __reserve_bp_slot(struct perf_event *bp, u64 bp_type) > fetch_this_slot(&slots, weight); > > /* Flexible counters need to keep at least one slot */ > - if (slots.pinned + (!!slots.flexible) > nr_slots[type]) > + if (slots.pinned + (!!slots.flexible) > hw_breakpoint_slots_cached(type)) > return -ENOSPC; > > ret = arch_reserve_bp_slot(bp); > @@ -688,42 +739,19 @@ static struct pmu perf_breakpoint = { > > int __init init_hw_breakpoint(void) > { > - int cpu, err_cpu; > - int i, ret; > - > - for (i = 0; i < TYPE_MAX; i++) > - nr_slots[i] = hw_breakpoint_slots(i); > - > - for_each_possible_cpu(cpu) { > - for (i = 0; i < TYPE_MAX; i++) { > - struct bp_cpuinfo *info = get_bp_info(cpu, i); > - > - info->tsk_pinned = kcalloc(nr_slots[i], sizeof(int), > - GFP_KERNEL); > - if (!info->tsk_pinned) { > - ret = -ENOMEM; > - goto err; > - } > - } > - } > + int ret; > > ret = rhltable_init(&task_bps_ht, &task_bps_ht_params); > if (ret) > - goto err; > + return ret; > + > + ret = init_breakpoint_slots(); > + if (ret) > + return ret; > > constraints_initialized = true; > > perf_pmu_register(&perf_breakpoint, "breakpoint", PERF_TYPE_BREAKPOINT); > > return register_die_notifier(&hw_breakpoint_exceptions_nb); > - > -err: > - for_each_possible_cpu(err_cpu) { > - for (i = 0; i < TYPE_MAX; i++) > - kfree(get_bp_info(err_cpu, i)->tsk_pinned); > - if (err_cpu == cpu) > - break; > - } > - > - return ret; > } > -- > 2.36.1.255.ge46751e96f-goog >