Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp2538848rwd; Wed, 14 Jun 2023 04:28:04 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7EZ+/eeUll2Scna0vQVAv8EX9bg1V2U1g6aK+yCCjHuY397oytaNNUguvTVy0ZeZTVOlrE X-Received: by 2002:a05:6871:6baa:b0:19e:8deb:389b with SMTP id zh42-20020a0568716baa00b0019e8deb389bmr9576322oab.41.1686742084135; Wed, 14 Jun 2023 04:28:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686742084; cv=none; d=google.com; s=arc-20160816; b=QlNVVB4437tzGPx8Pxe96JZo3j4eQHH6lPRf+NILfTmH4J5Y503/HdjEE5xezkXz87 MtNS3d3T7fWmwDuH95HAE34Nq4wapmX+w085hfog0ANbGAUtj7pwlEsD4HNEMmvjUGR2 Dk4RcI5tT65ta/SrUSHWHdTJ4L3l11cvHsVuENj2RC0N7gWBnwLnFSeFiU7UYf4LNGgL McRzXhLutBNmySoqVspkCJe25vbyVTrqEKqAqATOeWy9WecKMQu07MeM/Rz9U/f2chYO M7GL4c9WVeNRVtp1CygErfofppLpspk7t6Xz7v/e0yBlUzPGdEURmy533G03qM0ELHy9 Sx6A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date; bh=Yivx9I00pEzNHV0mSBAe/yLB4Eg3LsKfyz1Q4hR9kbw=; b=H+c4kKo2l+5T4hEE+Pd2jJfv7+fqNwQVvJPwozRp5Q1LPVs5utbtOrDd9qW74kKW8s 8M4vDA4esU5o8nBPXpDfxmkkld5JkdbhjjWcmkOJmloE2A+IPBba3/tOd/KTPzECtZq0 LQ8CccFv46C3WAdfxiSqvQyrAJzfuTB/yJHw2pR4MxBUXjbJpV1sPoTqS6gQQTgUQr16 GVFQ2yF693MO3amNlMz3BPWuXpRwYIGfW0K8W6XcGX1d/gmXE6fYgmuKfzui+dlXZmEj l5suxf5rWP2ijKZa5rVtaaSaPjp9DdLydC3TKbtIKXNyAuIsWmdoW6w0fR2Z4LcizHPR bSTg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 135-20020a63018d000000b0054fe4986366si9992pgb.342.2023.06.14.04.27.51; Wed, 14 Jun 2023 04:28:04 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236186AbjFNK7n (ORCPT + 99 others); Wed, 14 Jun 2023 06:59:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59598 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235923AbjFNK7l (ORCPT ); Wed, 14 Jun 2023 06:59:41 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 650F91BEF; Wed, 14 Jun 2023 03:59:37 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E46941FB; Wed, 14 Jun 2023 04:00:20 -0700 (PDT) Received: from FVFF77S0Q05N (unknown [10.57.24.16]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 60BB43F71E; Wed, 14 Jun 2023 03:59:34 -0700 (PDT) Date: Wed, 14 Jun 2023 11:59:28 +0100 From: Mark Rutland To: Anshuman Khandual Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, catalin.marinas@arm.com, Mark Brown , James Clark , Rob Herring , Marc Zyngier , Suzuki Poulose , Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , linux-perf-users@vger.kernel.org Subject: Re: [PATCH V11 08/10] arm64/perf: Add struct brbe_regset helper functions Message-ID: References: <20230531040428.501523-1-anshuman.khandual@arm.com> <20230531040428.501523-9-anshuman.khandual@arm.com> <2e91f218-e740-5b7d-fa8d-8fc43a6502a2@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2e91f218-e740-5b7d-fa8d-8fc43a6502a2@arm.com> X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jun 14, 2023 at 10:44:38AM +0530, Anshuman Khandual wrote: > On 6/13/23 22:47, Mark Rutland wrote: > >> +/* > >> + * This scans over BRBE register banks and captures individual branch reocrds > >> + * [BRBSRC, BRBTGT, BRBINF] into a pre-allocated 'struct brbe_regset' buffer, > >> + * until an invalid one gets encountered. The caller for this function needs > >> + * to ensure BRBE is an appropriate state before the records can be captured. > >> + */ > >> +static int capture_brbe_regset(struct brbe_hw_attr *brbe_attr, struct brbe_regset *buf) > >> +{ > >> + int loop1_idx1, loop1_idx2, loop2_idx1, loop2_idx2; > >> + int idx, count; > >> + > >> + loop1_idx1 = BRBE_BANK0_IDX_MIN; > >> + if (brbe_attr->brbe_nr <= BRBE_BANK_MAX_ENTRIES) { > >> + loop1_idx2 = brbe_attr->brbe_nr - 1; > >> + loop2_idx1 = BRBE_BANK1_IDX_MIN; > >> + loop2_idx2 = BRBE_BANK0_IDX_MAX; > >> + } else { > >> + loop1_idx2 = BRBE_BANK0_IDX_MAX; > >> + loop2_idx1 = BRBE_BANK1_IDX_MIN; > >> + loop2_idx2 = brbe_attr->brbe_nr - 1; > >> + } > >> + > >> + select_brbe_bank(BRBE_BANK_IDX_0); > >> + for (idx = 0, count = loop1_idx1; count <= loop1_idx2; idx++, count++) { > >> + buf[idx].brbinf = get_brbinf_reg(idx); > >> + /* > >> + * There are no valid entries anymore on the buffer. > >> + * Abort the branch record processing to save some > >> + * cycles and also reduce the capture/process load > >> + * for the user space as well. > >> + */ > >> + if (brbe_invalid(buf[idx].brbinf)) > >> + return idx; > >> + > >> + buf[idx].brbsrc = get_brbsrc_reg(idx); > >> + buf[idx].brbtgt = get_brbtgt_reg(idx); > >> + } > >> + > >> + select_brbe_bank(BRBE_BANK_IDX_1); > >> + for (count = loop2_idx1; count <= loop2_idx2; idx++, count++) { > >> + buf[idx].brbinf = get_brbinf_reg(idx); > >> + /* > >> + * There are no valid entries anymore on the buffer. > >> + * Abort the branch record processing to save some > >> + * cycles and also reduce the capture/process load > >> + * for the user space as well. > >> + */ > >> + if (brbe_invalid(buf[idx].brbinf)) > >> + return idx; > >> + > >> + buf[idx].brbsrc = get_brbsrc_reg(idx); > >> + buf[idx].brbtgt = get_brbtgt_reg(idx); > >> + } > >> + return idx; > >> +} > > > > As with __armv8pmu_branch_read(), the loop conditions are a bit hard to follow, > > and I believe that can be rewritten along the lines of the suggestion there. > > I have changed both the places (in separate patches) with suggested loop structure. > > > > > Looking at this, we now have a couple of places that will try to read the > > registers for an individual record, so it probably makes sense to facotr that > > into a helper, e.g. > > There are indeed two places inside capture_brbe_regset() - one for each bank. > > > > > | static bool __read_brbe_regset(struct brbe_regset *entry, int idx) > > | { > > | u64 brbinf = get_brbinf_reg(idx); > > | > > | if (brbe_invalid(brbinf)) > > | return false; > > | > > | entry->brbinf = brbinf; > > | entry->brbsrc = get_brbsrc_reg(idx); > > | entry->brbtgt = get_brbtgt_reg(idx); > > | > > | return true; > > | } > > > > ... which can be used here, e.g. > > > > | /* > > | * Capture all records before the first invalid record, and return the number > > | * of records captured. > > | */ > > | static int capture_brbe_regset(struct brbe_hw_attr *brbe_attr, struct brbe_regset *buf) > > | { > > | > > | int nr_entries = brbe_attr->brbe_nr; > > | int idx = 0; > > | > > | select_brbe_bank(BRBE_BANK_IDX_0); > > | while (idx < nr_entries && IDX < BRBE_BANK0_IDX_MAX) { > > | if (__read_brbe_regset(&buf[idx], idx)) > > It should test !_read_brbe_regset(&buf[idx], idx)) instead as the error > case returns false. Yes, my bad. > >> +static int stitch_stored_live_entries(struct brbe_regset *stored, > >> + struct brbe_regset *live, > >> + int nr_stored, int nr_live, > >> + int nr_max) > >> +{ > >> + int nr_total, nr_excess, nr_last, i; > >> + > >> + nr_total = nr_stored + nr_live; > >> + nr_excess = nr_total - nr_max; > >> + > >> + /* Stored branch records in stitched buffer */ > >> + if (nr_live == nr_max) > >> + nr_stored = 0; > >> + else if (nr_excess > 0) > >> + nr_stored -= nr_excess; > >> + > >> + /* Stitched buffer branch records length */ > >> + if (nr_total > nr_max) > >> + nr_last = nr_max; > >> + else > >> + nr_last = nr_total; > >> + > >> + /* Move stored branch records */ > >> + for (i = 0; i < nr_stored; i++) > >> + copy_brbe_regset(stored, i, stored, nr_last - nr_stored - 1 + i); > >> + > >> + /* Copy live branch records */ > >> + for (i = 0; i < nr_live; i++) > >> + copy_brbe_regset(live, i, stored, i); > >> + > >> + return nr_last; > >> +} > > > > I think this can be written more simply as something like: > > > > static int stitch_stored_live_entries(struct brbe_regset *stored, > > struct brbe_regset *live, > > int nr_stored, int nr_live, > > int nr_max) > > { > > int nr_move = max(nr_stored, nr_max - nr_live); > > Should this compare be min() instead ? Yup, my bad again. That should be min(). > > /* Move the tail of the buffer to make room for the new entries */ > > memmove(&stored[nr_live], &stored[0], nr_move * sizeof(*stored)); > > > > /* Copy the new entries into the head of the buffer */ > > memcpy(stored[0], &live[0], nr_live * sizeof(*stored)); > > > > /* Return the number of entries in the stitched buffer */ > > return min(nr_live + nr_stored, nr_max); > > } > > Otherwise this makes sense and simpler, will rework. Great! Thanks, Mark.