From: Anup Sharma
Date: Thu, 6 Jul 2023 01:26:50 +0530
To: Namhyung Kim
Cc: Anup Sharma, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
    Mark Rutland, Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
    linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/9] scripts: python: Implement parsing of input data in convertPerfScriptProfile
References: <3772bce9068962f2a4c57672e919ebdf30edbc5c.1687375189.git.anupnewsmail@gmail.com>

On Fri, Jun 23, 2023 at 05:03:12PM -0700, Namhyung Kim wrote:
> Hi Anup,
>
> On Wed, Jun 21, 2023 at 12:41 PM Anup Sharma wrote:
> >
> > The lines variable is created by splitting the profile string into individual
> > lines. It allows for iterating over each line for processing.
> >
> > The line is considered the start of a sample. It is matched against a regular
> > expression pattern to extract relevant information such as before_time_stamp,
> > time_stamp, threadNamePidAndTidMatch, threadName, pid, and tid.
> >
> > The stack frames of the current sample are then parsed in a nested loop.
> > Each stackFrameLine is matched against a regular expression pattern to
> > extract rawFunc and mod information.
> >
> > Also fixed few checkpatch warnings.
> >
> > Signed-off-by: Anup Sharma
> > ---
> >  .../scripts/python/firefox-gecko-converter.py | 62 ++++++++++++++++++-
> >  1 file changed, 60 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> > index 0ff70c0349c8..e5bc7a11c3e6 100644
> > --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> > +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> > @@ -1,4 +1,5 @@
> >  #!/usr/bin/env python3
> > +# SPDX-License-Identifier: GPL-2.0
>
> Please put this line in the first commit.

Sure, this is done in the latest version.

> >  import re
> >  import sys
> >  import json
> > @@ -14,13 +15,13 @@ def isPerfScriptFormat(profile):
> >      firstLine = profile[:profile.index('\n')]
> >      return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))
> >
> > -def convertPerfScriptProfile(profile):
> > +def convertPerfScriptProfile(profile):
>
> You'd better configure your editor to warn or even fix
> the trailing whitespace automatically.

Thanks, I followed your advice and configured nvim to highlight
trailing whitespace automatically. It has significantly improved my
workflow. Here is the snippet I added to my vimrc file:

    highlight ExtraWhitespace ctermbg=white guibg=white
    match ExtraWhitespace /\s\+$/

> Thanks,
> Namhyung
>
>
> >
> >      def addSample(threadName, stackArray, time):
> >          nonlocal name
> >          if name != threadName:
> >              name = threadName
> > -        # TODO:
> > +        # TODO:
> >          # get_or_create_stack will create a new stack if it doesn't exist, or return the existing stack if it does.
> >          # get_or_create_frame will create a new frame if it doesn't exist, or return the existing frame if it does.
> >          stack = reduce(lambda prefix, stackFrame: get_or_create_stack(get_or_create_frame(stackFrame), prefix), stackArray, None)
> > @@ -54,3 +55,60 @@ def convertPerfScriptProfile(profile):
> >              thread = _createtread(threadName, pid, tid)
> >              threadMap[tid] = thread
> >          thread['addSample'](threadName, stack, time_stamp)
> > +
> > +    lines = profile.split('\n')
> > +
> > +    line_index = 0
> > +    startTime = 0
> > +    while line_index < len(lines):
> > +        line = lines[line_index]
> > +        line_index += 1
> > +        # perf script --header outputs header lines beginning with #
> > +        if line == '' or line.startswith('#'):
> > +            continue
> > +
> > +        sample_start_line = line
> > +
> > +        sample_start_match = re.match(r'^(.*)\s+([\d.]+):', sample_start_line)
> > +        if not sample_start_match:
> > +            print(f'Could not parse line as the start of a sample in the "perf script" profile format: "{sample_start_line}"')
> > +            continue
> > +
> > +        before_time_stamp = sample_start_match[1]
> > +        time_stamp = float(sample_start_match[2]) * 1000
> > +        threadNamePidAndTidMatch = re.match(r'^(.*)\s+(?:(\d+)\/)?(\d+)\b', before_time_stamp)
> > +
> > +        if not threadNamePidAndTidMatch:
> > +            print('Could not parse line as the start of a sample in the "perf script" profile format: "%s"' % sampleStartLine)
> > +            continue
> > +        threadName = threadNamePidAndTidMatch[1].strip()
> > +        pid = int(threadNamePidAndTidMatch[2] or 0)
> > +        tid = int(threadNamePidAndTidMatch[3] or 0)
> > +        if startTime == 0:
> > +            startTime = time_stamp
> > +        # Parse the stack frames of the current sample in a nested loop.
> > +        stack = []
> > +        while line_index < len(lines):
> > +            stackFrameLine = lines[line_index]
> > +            line_index += 1
> > +            if stackFrameLine.strip() == '':
> > +                # Sample ends.
> > +                break
> > +            stackFrameMatch = re.match(r'^\s*(\w+)\s*(.+) \(([^)]*)\)', stackFrameLine)
> > +            if stackFrameMatch:
> > +                rawFunc = stackFrameMatch[2]
> > +                mod = stackFrameMatch[3]
> > +                rawFunc = re.sub(r'\+0x[\da-f]+$', '', rawFunc)
> > +
> > +                if rawFunc.startswith('('):
> > +                    continue # skip process names
> > +
> > +                if mod:
> > +                    # If we have a module name, provide it.
> > +                    # The code processing the profile will search for
> > +                    # "functionName (in libraryName)" using a regexp,
> > +                    # and automatically create the library information.
> > +                    rawFunc += f' (in {mod})'
> > +
> > +                stack.append(rawFunc)
> > +
> > --
> > 2.34.1
> >
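
For reference, the two-level parsing described in the commit message
(a sample header line followed by its stack frame lines) can be sketched
as a standalone function. This is only a minimal sketch under
assumptions, not the code from the patch: the helper name
parse_perf_script, the tuple it returns, and the EXAMPLE input are
hypothetical and chosen for illustration. The three regular expressions
are the ones quoted in the diff above.

```python
import re

# Regular expressions copied from the quoted patch.
SAMPLE_START_RE = re.compile(r'^(.*)\s+([\d.]+):')
THREAD_RE = re.compile(r'^(.*)\s+(?:(\d+)\/)?(\d+)\b')
FRAME_RE = re.compile(r'^\s*(\w+)\s*(.+) \(([^)]*)\)')


def parse_perf_script(profile):
    """Hypothetical helper (not in the patch).

    Returns a list of (thread_name, pid, tid, time_ms, stack) tuples.
    """
    samples = []
    lines = profile.split('\n')
    line_index = 0
    while line_index < len(lines):
        line = lines[line_index]
        line_index += 1
        # Skip blank lines and '#' header lines.
        if line == '' or line.startswith('#'):
            continue

        start = SAMPLE_START_RE.match(line)
        if not start:
            continue
        thread = THREAD_RE.match(start[1])
        if not thread:
            continue

        thread_name = thread[1].strip()
        pid = int(thread[2] or 0)          # pid is optional in the header
        tid = int(thread[3] or 0)
        time_ms = float(start[2]) * 1000   # seconds to milliseconds

        # Nested loop: consume frame lines until a blank line ends the sample.
        stack = []
        while line_index < len(lines):
            frame_line = lines[line_index]
            line_index += 1
            if frame_line.strip() == '':
                break
            frame = FRAME_RE.match(frame_line)
            if not frame:
                continue
            # Drop the trailing "+0x..." offset from the symbol name.
            raw_func = re.sub(r'\+0x[\da-f]+$', '', frame[2])
            if raw_func.startswith('('):
                continue  # skip process names
            mod = frame[3]
            if mod:
                raw_func += f' (in {mod})'
            stack.append(raw_func)

        samples.append((thread_name, pid, tid, time_ms, stack))
    return samples


# Made-up input in the shape the regular expressions expect.
EXAMPLE = (
    "# header line\n"
    "perf 1234/1235 100.500000: 1 cycles:\n"
    "\tffffffff8106a5b4 native_write_msr+0x4 ([kernel.kallsyms])\n"
    "\t7f0e3c2a1b2c main+0x1c (/usr/bin/perf)\n"
    "\n"
    "swapper 0 101.000000: 1 cycles:\n"
    "\tffffffff81000000 do_idle+0x10 ([kernel.kallsyms])\n"
)

if __name__ == '__main__':
    for sample in parse_perf_script(EXAMPLE):
        print(sample)
```

Returning plain tuples keeps the sketch independent of the thread and
stack tables that the rest of the series builds. In the patch itself the
parsed values are instead passed on to the per-thread addSample path.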