LinuxLists.cc - [PATCH 0/9] Add support for Firefox's gecko profile format

2023-06-21 20:05:50

Subject: [PATCH 0/9] Add support for Firefox's gecko profile format

This patch series adds support for Firefox's gecko profile format.
The format is documented at [1].

The series adds a new Python script that converts perf script output
to the gecko profile format. To use the script, run the following
commands:

perf record
perf script -F +pid > perf_data.txt
python3 firefox-gecko-converter.py > gecko_profile.json

Also, don't forget to change the owner of the output file to your user with chown [2].

[1] https://github.com/firefox-devtools/profiler/blob/main/docs-developer/gecko-profile-format.md
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1823421

Anup Sharma (9):
scripts: python: Add check for correct perf script format
scripts: python: implement add sample function and return it
scripts: python: Introduce thread sample processing in
convertPerfScriptProfile
scripts: python: Implement parsing of input data in
convertPerfScriptProfile
scripts: python: implement function for thread creation
scripts: python: implement get or create stack function
scripts: python: implement get or create frame function
scripts: python: Finalize convertPerfScriptProfile and return profile
data
scripts: python: Add temporary main function for testing purposes

.../scripts/python/firefox-gecko-converter.py | 249 ++++++++++++++++++
1 file changed, 249 insertions(+)
create mode 100644 tools/perf/scripts/python/firefox-gecko-converter.py

--
2.34.1

2023-06-21 20:06:43

by Anup Sharma

Subject: [PATCH 4/9] scripts: python: Implement parsing of input data in convertPerfScriptProfile

The profile string is split into individual lines, stored in the lines
variable, so that each line can be processed in turn.

Each line that is neither empty nor a '#' header line is treated as the start
of a sample. It is matched against a regular expression to extract
before_time_stamp and time_stamp; before_time_stamp is then matched again
(threadNamePidAndTidMatch) to extract threadName, pid, and tid.

The stack frames of the current sample are then parsed in a nested loop.
Each stackFrameLine is matched against a regular expression pattern to
extract rawFunc and mod information.
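
The parsing steps described above can be sketched in isolation as follows.
This is only a minimal illustration, not the code from the patch: the
SAMPLE text and the parse_sample helper are hypothetical, and the frame
pattern assumes the module name is wrapped in literal parentheses.

```python
import re

# Hypothetical input, shaped like "perf script -F +pid" output.
SAMPLE = """\
firefox 1234/1240 5678.123456: 1000 cycles:
\tffffffff81000000 do_syscall_64+0x3b ([kernel.kallsyms])
\t00007f0000001000 main+0x10 (/usr/bin/firefox)
"""

HEADER_RE = re.compile(r'^(.*)\s+([\d.]+):')
THREAD_RE = re.compile(r'^(.*)\s+(?:(\d+)\/)?(\d+)\b')
FRAME_RE = re.compile(r'^\s*(\w+)\s*(.+) \(([^)]*)\)$')


def parse_sample(text):
    """Parse one sample: a header line followed by its stack frame lines."""
    lines = text.split('\n')
    header = HEADER_RE.match(lines[0])
    if header is None:
        raise ValueError('not the start of a sample: %r' % lines[0])
    time_ms = float(header[2]) * 1000  # perf prints seconds
    thread = THREAD_RE.match(header[1])
    if thread is None:
        raise ValueError('no pid/tid found in: %r' % header[1])
    name = thread[1].strip()
    pid = int(thread[2] or 0)
    tid = int(thread[3] or 0)

    stack = []
    for frame_line in lines[1:]:
        if frame_line.strip() == '':
            break  # a blank line ends the sample
        frame = FRAME_RE.match(frame_line)
        if frame is None:
            continue
        func = re.sub(r'\+0x[\da-f]+$', '', frame[2])  # drop the offset
        if func.startswith('('):
            continue  # skip process names
        if frame[3]:
            func += f' (in {frame[3]})'
        stack.append(func)
    return name, pid, tid, time_ms, stack
```

Frames are returned in the order perf prints them, innermost first.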

Also fixed a few checkpatch warnings.

Signed-off-by: Anup Sharma <[email protected]>
---
.../scripts/python/firefox-gecko-converter.py | 62 ++++++++++++++++++-
1 file changed, 60 insertions(+), 2 deletions(-)

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
index 0ff70c0349c8..e5bc7a11c3e6 100644
--- a/tools/perf/scripts/python/firefox-gecko-converter.py
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -1,4 +1,5 @@
#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
import re
import sys
import json
@@ -14,13 +15,13 @@ def isPerfScriptFormat(profile):
firstLine = profile[:profile.index('\n')]
return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))

-def convertPerfScriptProfile(profile):
+def convertPerfScriptProfile(profile):

def addSample(threadName, stackArray, time):
nonlocal name
if name != threadName:
name = threadName
- # TODO:
+ # TODO:
# get_or_create_stack will create a new stack if it doesn't exist, or return the existing stack if it does.
# get_or_create_frame will create a new frame if it doesn't exist, or return the existing frame if it does.
stack = reduce(lambda prefix, stackFrame: get_or_create_stack(get_or_create_frame(stackFrame), prefix), stackArray, None)
@@ -54,3 +55,60 @@ def convertPerfScriptProfile(profile):
thread = _createtread(threadName, pid, tid)
threadMap[tid] = thread
thread['addSample'](threadName, stack, time_stamp)
+
+ lines = profile.split('\n')
+
+ line_index = 0
+ startTime = 0
+ while line_index < len(lines):
+ line = lines[line_index]
+ line_index += 1
+ # perf script --header outputs header lines beginning with #
+ if line == '' or line.startswith('#'):
+ continue
+
+ sample_start_line = line
+
+ sample_start_match = re.match(r'^(.*)\s+([\d.]+):', sample_start_line)
+ if not sample_start_match:
+ print(f'Could not parse line as the start of a sample in the "perf script" profile format: "{sample_start_line}"')
+ continue
+
+ before_time_stamp = sample_start_match[1]
+ time_stamp = float(sample_start_match[2]) * 1000
+ threadNamePidAndTidMatch = re.match(r'^(.*)\s+(?:(\d+)\/)?(\d+)\b', before_time_stamp)
+
+ if not threadNamePidAndTidMatch:
+ print('Could not parse line as the start of a sample in the "perf script" profile format: "%s"' % sample_start_line)
+ continue
+ threadName = threadNamePidAndTidMatch[1].strip()
+ pid = int(threadNamePidAndTidMatch[2] or 0)
+ tid = int(threadNamePidAndTidMatch[3] or 0)
+ if startTime == 0:
+ startTime = time_stamp
+ # Parse the stack frames of the current sample in a nested loop.
+ stack = []
+ while line_index < len(lines):
+ stackFrameLine = lines[line_index]
+ line_index += 1
+ if stackFrameLine.strip() == '':
+ # Sample ends.
+ break
+ stackFrameMatch = re.match(r'^\s*(\w+)\s*(.+) \(([^)]*)\)$', stackFrameLine)
+ if stackFrameMatch:
+ rawFunc = stackFrameMatch[2]
+ mod = stackFrameMatch[3]
+ rawFunc = re.sub(r'\+0x[\da-f]+$', '', rawFunc)
+
+ if rawFunc.startswith('('):
+ continue # skip process names
+
+ if mod:
+ # If we have a module name, provide it.
+ # The code processing the profile will search for
+ # "functionName (in libraryName)" using a regexp,
+ # and automatically create the library information.
+ rawFunc += f' (in {mod})'
+
+ stack.append(rawFunc)
+
--
2.34.1

2023-06-21 20:06:50

by Anup Sharma

Subject: [PATCH 6/9] scripts: python: implement get or create stack function

The get_or_create_stack function is responsible for retrieving
or creating a stack based on the provided frame and prefix.
It first generates a key using the frame and prefix values.
If the stack corresponding to the key is found in the stackMap,
it is returned. Otherwise, a new stack is created by appending
the prefix and frame to the stackTable's 'data' array. The key
and the index of the newly created stack are added to the
stackMap for future reference.
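
As a standalone illustration of the lookup described above, the sketch
below interns (prefix, frame) pairs the same way. The module-level
stack_table and stack_map names are assumptions made for the example;
in the patch the equivalent state lives inside convertPerfScriptProfile.

```python
# Standalone state for the example only.
stack_table = {'schema': {'prefix': 0, 'frame': 1}, 'data': []}
stack_map = {}


def get_or_create_stack(frame, prefix):
    """Return the index of the (prefix, frame) stack, creating it if needed."""
    key = f'{frame}' if prefix is None else f'{frame},{prefix}'
    stack = stack_map.get(key)
    if stack is None:
        stack = len(stack_table['data'])
        stack_table['data'].append([prefix, frame])
        stack_map[key] = stack
    return stack


root = get_or_create_stack(0, None)    # frame 0 with no caller
child = get_or_create_stack(1, root)   # frame 1 called from root
again = get_or_create_stack(1, root)   # same pair, so the index is reused
```

Because every stack refers to its caller by index, call chains that
share a prefix are stored only once.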

Signed-off-by: Anup Sharma <[email protected]>
---
tools/perf/scripts/python/firefox-gecko-converter.py | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
index cdd7f901c13f..30fc542cfdeb 100644
--- a/tools/perf/scripts/python/firefox-gecko-converter.py
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -57,6 +57,18 @@ def convertPerfScriptProfile(profile):
},
'data': [],
}
+
+ stringTable = []
+
+ stackMap = dict()
+ def get_or_create_stack(frame, prefix):
+ key = f"{frame}" if prefix is None else f"{frame},{prefix}"
+ stack = stackMap.get(key)
+ if stack is None:
+ stack = len(stackTable['data'])
+ stackTable['data'].append([prefix, frame])
+ stackMap[key] = stack
+ return stack

def addSample(threadName, stackArray, time):
nonlocal name
--
2.34.1

2023-06-21 20:08:18

by Anup Sharma

Subject: [PATCH 9/9] scripts: python: Add temporary main function for testing purposes

This commit introduces a temporary main function for ease of testing.
Please note that this function is not intended to be a permanent
part of the codebase.

The output is serialized as JSON and printed to the standard output
(stdout) with an indentation of 2 for better readability.
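
For reference, the serialization step behaves as in the small sketch
below. The profile dictionary is a made-up placeholder, and an in-memory
buffer stands in for sys.stdout so the result can be inspected.

```python
import io
import json

# Placeholder data; the real script dumps the converted profile.
profile = {'meta': {'version': 24}, 'threads': []}

buf = io.StringIO()  # stands in for sys.stdout
json.dump(profile, buf, indent=2)
text = buf.getvalue()
```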

Signed-off-by: Anup Sharma <[email protected]>
---
tools/perf/scripts/python/firefox-gecko-converter.py | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
index 385a8b77a70a..0f133d9acee9 100644
--- a/tools/perf/scripts/python/firefox-gecko-converter.py
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -236,3 +236,14 @@ def convertPerfScriptProfile(profile):
'pausedRanges': []
}

+def main():
+# inputFile = input('Enter input file name: ')
+ with open('perf_data.txt') as f:
+ profile = f.read()
+ isPerfScript = isPerfScriptFormat(profile)
+ output = convertPerfScriptProfile(profile)
+ json.dump(output, sys.stdout, indent=2)
+# print('isPerfScript: {}'.format(isPerfScript))
+
+if __name__ == '__main__':
+ main()
--
2.34.1

2023-06-21 20:13:19

by Anup Sharma

Subject: [PATCH 7/9] scripts: python: implement get or create frame function

Introduce the CATEGORIES list and the USER_CATEGORY_INDEX and
KERNEL_CATEGORY_INDEX constants.

The get_or_create_frame function is responsible for retrieving or
creating a frame based on the provided frameString. If the frame
corresponding to the frameString is found in the frameMap, it is
returned. Otherwise, a new frame is created by appending relevant
information to the frameTable's 'data' array and adding the
frameString to the stringTable.

The index of the newly created frame is added to the frameMap.
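
The frame lookup can be sketched on its own as shown below. This is an
illustrative rewrite rather than the patch itself: the is_kernel_frame
helper is a hypothetical name that splits the category test out of the
single long expression, and the tables are module-level only to keep the
example self-contained.

```python
USER_CATEGORY_INDEX = 0
KERNEL_CATEGORY_INDEX = 1

# Standalone state for the example only.
string_table = []
frame_table = {'data': []}
frame_map = {}


def is_kernel_frame(frame_string):
    """Hypothetical helper: same test as the patch, split over lines."""
    return ('kallsyms' in frame_string
            or '/vmlinux' in frame_string
            or frame_string.endswith('.ko)'))


def get_or_create_frame(frame_string):
    """Return the index of the frame for frame_string, creating it if needed."""
    frame = frame_map.get(frame_string)
    if frame is None:
        frame = len(frame_table['data'])
        location = len(string_table)
        string_table.append(frame_string)
        if is_kernel_frame(frame_string):
            category = KERNEL_CATEGORY_INDEX
        else:
            category = USER_CATEGORY_INDEX
        # Row order: location, relevantForJS, innerWindowID, implementation,
        # optimizations, line, column, category, subcategory.
        frame_table['data'].append(
            [location, False, 0, None, None, None, None, category, None])
        frame_map[frame_string] = frame
    return frame
```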

Signed-off-by: Anup Sharma <[email protected]>
---
.../scripts/python/firefox-gecko-converter.py | 38 +++++++++++++++++++
1 file changed, 38 insertions(+)

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
index 30fc542cfdeb..866751e5d1ce 100644
--- a/tools/perf/scripts/python/firefox-gecko-converter.py
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -15,6 +15,13 @@ def isPerfScriptFormat(profile):
firstLine = profile[:profile.index('\n')]
return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))

+CATEGORIES = [
+{'name': 'User', 'color': 'yellow', 'subcategories': ['Other']},
+{'name': 'Kernel', 'color': 'orange', 'subcategories': ['Other']}
+]
+USER_CATEGORY_INDEX = 0
+KERNEL_CATEGORY_INDEX = 1
+
def convertPerfScriptProfile(profile):
def _createtread(name, pid, tid):
markers = {
@@ -70,6 +77,37 @@ def convertPerfScriptProfile(profile):
stackMap[key] = stack
return stack

+ frameMap = dict()
+ def get_or_create_frame(frameString):
+ frame = frameMap.get(frameString)
+ if frame is None:
+ frame = len(frameTable['data'])
+ location = len(stringTable)
+ stringTable.append(frameString)
+
+ category = KERNEL_CATEGORY_INDEX if frameString.find('kallsyms') != -1 or frameString.find('/vmlinux') != -1 or frameString.endswith('.ko)') else USER_CATEGORY_INDEX
+ implementation = None
+ optimizations = None
+ line = None
+ relevantForJS = False
+ subcategory = None
+ innerWindowID = 0
+ column = None
+
+ frameTable['data'].append([
+ location,
+ relevantForJS,
+ innerWindowID,
+ implementation,
+ optimizations,
+ line,
+ column,
+ category,
+ subcategory,
+ ])
+ frameMap[frameString] = frame
+ return frame
+
def addSample(threadName, stackArray, time):
nonlocal name
if name != threadName:
--
2.34.1

2023-06-21 20:14:30

by Anup Sharma

Subject: [PATCH 1/9] scripts: python: Add check for correct perf script format

The isPerfScriptFormat function checks whether the input looks like perf
script output. Input that starts with a '# ========' header is accepted,
input that starts with '{' (JSON) is rejected, and otherwise the first
line is matched against a regular expression for a sample header.
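
A minimal sketch of such a check, with a few inputs to try, is shown
below. The pattern is copied from the patch; the snake_case name and the
use of split() (so that input without a newline does not raise) are
choices made only for this example.

```python
import re

# Pattern copied from the patch; everything else here is illustrative.
SAMPLE_LINE_RE = re.compile(
    r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:')


def is_perf_script_format(profile):
    """Heuristically decide whether profile is "perf script" text output."""
    if profile.startswith('# ========\n'):
        return True   # "perf script --header" banner
    if profile.startswith('{'):
        return False  # looks like JSON
    first_line = profile.split('\n', 1)[0]
    return bool(SAMPLE_LINE_RE.match(first_line))
```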

Signed-off-by: Anup Sharma <[email protected]>
---
.../scripts/python/firefox-gecko-converter.py | 15 +++++++++++++++
1 file changed, 15 insertions(+)
create mode 100644 tools/perf/scripts/python/firefox-gecko-converter.py

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
new file mode 100644
index 000000000000..73a431d0c7d1
--- /dev/null
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -0,0 +1,15 @@
+#!/usr/bin/env python3
+import re
+import sys
+import json
+from functools import reduce
+
+def isPerfScriptFormat(profile):
+ if profile.startswith('# ========\n'):
+ return True
+
+ if profile.startswith('{'):
+ return False
+
+ firstLine = profile[:profile.index('\n')]
+ return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))
--
2.34.1

2023-06-21 20:14:31

by Anup Sharma

Subject: [PATCH 2/9] scripts: python: implement add sample function and return finish function

The main convertPerfScriptProfile function returns a dictionary with
references to the addSample and finish functions, allowing external
code to use them for profile conversion. A few more nested helper
functions will be added in later patches.

The addSample function appends a new entry to the 'samples' data structure.
It takes the thread name, stack array, and time as input parameters.
If the thread name differs from the current name, it updates the name.
The function utilizes the get_or_create_stack and get_or_create_frame
methods to construct the stack structure. Finally, it adds the stack,
time, and responsiveness values to the 'data' list within 'samples'.

The finish function generates a dictionary containing various profile information
such as 'tid', 'pid', 'name', 'markers', 'samples', 'frameTable', 'stackTable',
'stringTable', 'registerTime', 'unregisterTime', and 'processType'.
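
How addSample folds a list of frame strings into a single stack index can
be sketched as below. The two helpers are simplified stand-ins written
for this example (they key on tuples and keep no frame metadata), so the
snippet shows only the reduce step, not the final data layout.

```python
from functools import reduce

# Simplified stand-ins for the tables built elsewhere in the series.
frames = {}       # frame string -> frame index
stacks = {}       # (frame, prefix) -> stack index
stack_rows = []   # [prefix, frame] rows
samples = {'schema': {'stack': 0, 'time': 1, 'responsiveness': 2}, 'data': []}


def get_or_create_frame(frame_string):
    return frames.setdefault(frame_string, len(frames))


def get_or_create_stack(frame, prefix):
    key = (frame, prefix)
    if key not in stacks:
        stacks[key] = len(stack_rows)
        stack_rows.append([prefix, frame])
    return stacks[key]


def add_sample(stack_array, time):
    """Fold stack_array (outermost frame first) into one stack index."""
    stack = reduce(
        lambda prefix, frame: get_or_create_stack(
            get_or_create_frame(frame), prefix),
        stack_array, None)
    responsiveness = 0
    samples['data'].append([stack, time, responsiveness])
    return stack
```

Two samples that share their outer frames reuse the same prefix rows and
differ only in the last stack entry.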

Signed-off-by: Anup Sharma <[email protected]>
---
.../scripts/python/firefox-gecko-converter.py | 33 +++++++++++++++++++
1 file changed, 33 insertions(+)

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
index 73a431d0c7d1..2817d4a96269 100644
--- a/tools/perf/scripts/python/firefox-gecko-converter.py
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -13,3 +13,36 @@ def isPerfScriptFormat(profile):

firstLine = profile[:profile.index('\n')]
return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))
+
+def convertPerfScriptProfile(profile):
+
+ def addSample(threadName, stackArray, time):
+ nonlocal name
+ if name != threadName:
+ name = threadName
+ # TODO:
+ # get_or_create_stack will create a new stack if it doesn't exist, or return the existing stack if it does.
+ # get_or_create_frame will create a new frame if it doesn't exist, or return the existing frame if it does.
+ stack = reduce(lambda prefix, stackFrame: get_or_create_stack(get_or_create_frame(stackFrame), prefix), stackArray, None)
+ responsiveness = 0
+ samples['data'].append([stack, time, responsiveness])
+
+ def finish():
+ return {
+ "tid": tid,
+ "pid": pid,
+ "name": name,
+ "markers": markers,
+ "samples": samples,
+ "frameTable": frameTable,
+ "stackTable": stackTable,
+ "stringTable": stringTable,
+ "registerTime": 0,
+ "unregisterTime": None,
+ "processType": 'default'
+ }
+
+ return {
+ "addSample": addSample,
+ "finish": finish
+ }
--
2.34.1

2023-06-21 20:15:56

by Anup Sharma

Subject: [PATCH 5/9] scripts: python: implement function for thread creation

Create a thread with the specified name, process ID (pid),
and thread ID (tid). The function initializes markers,
samples, frameTable, and stackTable structures for the thread.

The markers structure defines the schema and data for thread
markers, including fields such as 'name', 'startTime',
'endTime', 'phase', 'category', and 'data'.

The samples structure defines the schema and data for thread
samples, including fields such as 'stack', 'time', and 'responsiveness'.

The frameTable structure defines the schema and data for frame
information, including fields such as 'location', 'relevantForJS',
'innerWindowID', 'implementation', 'optimizations', 'line',
'column', 'category', and 'subcategory'.

The purpose of this function is to create a new thread structure
with the necessary data schemas and initial data arrays. These
structures provide a framework for storing and organizing information
related to thread markers, samples, frame details, and stack information.
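
The schema/data layout described above can be illustrated with the sketch
below. The make_table, create_thread and get_field helpers are
hypothetical names for this example; the patch spells each table out as a
literal dictionary instead.

```python
def make_table(*fields):
    """Build an empty table whose schema maps field names to column indexes."""
    return {'schema': {name: i for i, name in enumerate(fields)}, 'data': []}


def create_thread(name, pid, tid):
    """Return a thread structure with empty marker, sample, frame and stack tables."""
    return {
        'name': name,
        'pid': pid,
        'tid': tid,
        'markers': make_table('name', 'startTime', 'endTime',
                              'phase', 'category', 'data'),
        'samples': make_table('stack', 'time', 'responsiveness'),
        'frameTable': make_table('location', 'relevantForJS', 'innerWindowID',
                                 'implementation', 'optimizations', 'line',
                                 'column', 'category', 'subcategory'),
        'stackTable': make_table('prefix', 'frame'),
    }


def get_field(table, row, name):
    """Read one column of a data row through the table's schema."""
    return row[table['schema'][name]]
```

Storing rows as plain lists and naming the columns once in the schema
keeps the JSON output compact.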

Signed-off-by: Anup Sharma <[email protected]>
---
.../scripts/python/firefox-gecko-converter.py | 41 +++++++++++++++++++
1 file changed, 41 insertions(+)

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
index e5bc7a11c3e6..cdd7f901c13f 100644
--- a/tools/perf/scripts/python/firefox-gecko-converter.py
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -16,6 +16,47 @@ def isPerfScriptFormat(profile):
return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))

def convertPerfScriptProfile(profile):
+ def _createtread(name, pid, tid):
+ markers = {
+ 'schema': {
+ 'name': 0,
+ 'startTime': 1,
+ 'endTime': 2,
+ 'phase': 3,
+ 'category': 4,
+ 'data': 5,
+ },
+ 'data': [],
+ }
+ samples = {
+ 'schema': {
+ 'stack': 0,
+ 'time': 1,
+ 'responsiveness': 2,
+ },
+ 'data': [],
+ }
+ frameTable = {
+ 'schema': {
+ 'location': 0,
+ 'relevantForJS': 1,
+ 'innerWindowID': 2,
+ 'implementation': 3,
+ 'optimizations': 4,
+ 'line': 5,
+ 'column': 6,
+ 'category': 7,
+ 'subcategory': 8,
+ },
+ 'data': [],
+ }
+ stackTable = {
+ 'schema': {
+ 'prefix': 0,
+ 'frame': 1,
+ },
+ 'data': [],
+ }

def addSample(threadName, stackArray, time):
nonlocal name
--
2.34.1

2023-06-21 20:17:03

by Anup Sharma

Subject: [PATCH 8/9] scripts: python: Finalize convertPerfScriptProfile and return profile data

If the stack is not empty, it is reversed and then passed to
the _addThreadSample function along with other relevant information.
This adds the final sample to the thread.

The thread_array is generated by mapping the 'finish' method on each
thread in the threadMap and collecting the results.
The samples within each thread are sorted in ascending order based on
the 'time' field to ensure they are in the correct order.
This implementation finalizes the processing of the profile data in
convertPerfScriptProfile and returns a structured profile representation.
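
The two finishing steps, reversing a stack and sorting samples by time,
can be shown with made-up data as follows. The stack and sample values
are invented for the example.

```python
# perf prints the innermost frame first; reverse so the outermost comes first.
stack = ['bar', 'foo', 'main']
stack.reverse()

# Made-up sample rows, deliberately out of order.
samples = {
    'schema': {'stack': 0, 'time': 1, 'responsiveness': 2},
    'data': [[2, 30.0, 0], [0, 10.0, 0], [1, 20.0, 0]],
}

# Look the column up through the schema rather than hard-coding index 1.
time_column = samples['schema']['time']
samples['data'].sort(key=lambda row: float(row[time_column]))
```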

The 'product' field should eventually be read from the system. I am still
unsure how best to do that, so I will implement it in the next version.

Signed-off-by: Anup Sharma <[email protected]>
---
.../scripts/python/firefox-gecko-converter.py | 37 ++++++++++++++++++-
1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
index 866751e5d1ce..385a8b77a70a 100644
--- a/tools/perf/scripts/python/firefox-gecko-converter.py
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -64,7 +64,7 @@ def convertPerfScriptProfile(profile):
},
'data': [],
}
-
+
stringTable = []

stackMap = dict()
@@ -84,7 +84,7 @@ def convertPerfScriptProfile(profile):
frame = len(frameTable['data'])
location = len(stringTable)
stringTable.append(frameString)
-
+
category = KERNEL_CATEGORY_INDEX if frameString.find('kallsyms') != -1 or frameString.find('/vmlinux') != -1 or frameString.endswith('.ko)') else USER_CATEGORY_INDEX
implementation = None
optimizations = None
@@ -203,3 +203,36 @@ def convertPerfScriptProfile(profile):

stack.append(rawFunc)

+ if len(stack) != 0:
+ stack.reverse()
+ _addThreadSample(pid, tid, threadName, time_stamp, stack)
+
+ thread_array = list(map(lambda thread: thread['finish'](), threadMap.values()))
+
+ for thread in thread_array:
+ # The samples are not guaranteed to be in order, sort them so that they are.
+ key = thread['samples']['schema']['time']
+ thread['samples']['data'].sort(key=lambda data : float(data[key]))
+
+ return {
+ 'meta': {
+ 'interval': 1,
+ 'processType': 0,
+ 'product': 'Linux perf', # TODO: get this from the system
+ 'stackwalk': 1,
+ 'debug': 0,
+ 'gcpoison': 0,
+ 'asyncstack': 1,
+ 'startTime': startTime,
+ 'shutdownTime': None,
+ 'version': 24,
+ 'presymbolicated': True,
+ 'categories': CATEGORIES,
+ 'markerSchema': []
+ },
+ 'libs': [],
+ 'threads': thread_array,
+ 'processes': [],
+ 'pausedRanges': []
+ }
+
--
2.34.1

2023-06-22 05:11:16

by Adrian Hunter

Subject: Re: [PATCH 0/9] Add support for Firefox's gecko profile format

On 21/06/23 22:35, Anup Sharma wrote:
> This patch series adds support for Firefox's gecko profile format.
> The format is documented here [1]
>
> The series adds a new python script that can be used to convert the
> perf script to gecko profile format. To use this script, use the
> following commands:
>
> perf record
> perf script -F +pid > perf_data.txt
> python3 firefox-gecko-converter.py > gecko_profile.json

Why not use the perf script python interface?

https://perf.wiki.kernel.org/index.php/Latest_Manual_Page_of_perf-script-python.1
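
For illustration only, a handler built on that interface might look like
the sketch below. It assumes the process_event(param_dict) and
trace_end() hooks, and the 'comm', 'sample' and 'callchain' dictionary
keys, as described in the perf-script-python documentation; those key
names should be verified against the perf version in use. The collected
list and the output format are made up for the example.

```python
# Sketch of a script meant to be run via "perf script -s <script>.py".
collected = []


def process_event(param_dict):
    """Called by perf once per sample (assumed dictionary layout)."""
    sample = param_dict['sample']
    stack = []
    for node in param_dict.get('callchain') or []:
        sym = node.get('sym')
        name = sym['name'] if sym else hex(node['ip'])
        dso = node.get('dso')
        stack.append(f'{name} (in {dso})' if dso else name)
    stack.reverse()  # outermost frame first
    collected.append({
        'comm': param_dict['comm'],
        'pid': sample['pid'],
        'tid': sample['tid'],
        'time_ms': sample['time'] / 1e6,  # assumes nanosecond timestamps
        'stack': stack,
    })


def trace_end():
    """Called by perf after the last sample."""
    print('collected %d samples' % len(collected))
```

With this approach perf does the parsing, so no regular expressions over
text output are needed.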

>
> Also dont forget to change the chown of the output file to the user[2].
>
> [1] https://github.com/firefox-devtools/profiler/blob/main/docs-developer/gecko-profile-format.md
> [2] https://bugzilla.mozilla.org/show_bug.cgi?id=1823421
>
> Anup Sharma (9):
> scripts: python: Add check for correct perf script format
> scripts: python: implement add sample function and return it
> scripts: python: Introduce thread sample processing in
> convertPerfScriptProfile
> scripts: python: Implement parsing of input data in
> convertPerfScriptProfile
> scripts: python: implement function for thread creation
> scripts: python: implement get or create stack function
> scripts: python: implement get or create frame function
> scripts: python: Finalize convertPerfScriptProfile and return profile
> data
> scripts: python: Add temporary main function for testing purposes
>
> .../scripts/python/firefox-gecko-converter.py | 249 ++++++++++++++++++
> 1 file changed, 249 insertions(+)
> create mode 100644 tools/perf/scripts/python/firefox-gecko-converter.py
>

2023-06-22 20:18:17

by Anup Sharma

Subject: Re: [PATCH 0/9] Add support for Firefox's gecko profile format

On Thu, Jun 22, 2023 at 07:43:13AM +0300, Adrian Hunter wrote:
> On 21/06/23 22:35, Anup Sharma wrote:
> > This patch series adds support for Firefox's gecko profile format.
> > The format is documented here [1]
> >
> > The series adds a new python script that can be used to convert the
> > perf script to gecko profile format. To use this script, use the
> > following commands:
> >
> > perf record
> > perf script -F +pid > perf_data.txt
> > python3 firefox-gecko-converter.py > gecko_profile.json
>
> Why not use the perf script python interface?

Hi Adrian,
Thank you for your suggestion. I am working on it and will try to implement
it in the next version.

> https://perf.wiki.kernel.org/index.php/Latest_Manual_Page_of_perf-script-python.1
>
> >
> > Also dont forget to change the chown of the output file to the user[2].
> >
> > [1] https://github.com/firefox-devtools/profiler/blob/main/docs-developer/gecko-profile-format.md
> > [2] https://bugzilla.mozilla.org/show_bug.cgi?id=1823421
> >
> > Anup Sharma (9):
> > scripts: python: Add check for correct perf script format
> > scripts: python: implement add sample function and return it
> > scripts: python: Introduce thread sample processing in
> > convertPerfScriptProfile
> > scripts: python: Implement parsing of input data in
> > convertPerfScriptProfile
> > scripts: python: implement function for thread creation
> > scripts: python: implement get or create stack function
> > scripts: python: implement get or create frame function
> > scripts: python: Finalize convertPerfScriptProfile and return profile
> > data
> > scripts: python: Add temporary main function for testing purposes
> >
> > .../scripts/python/firefox-gecko-converter.py | 249 ++++++++++++++++++
> > 1 file changed, 249 insertions(+)
> > create mode 100644 tools/perf/scripts/python/firefox-gecko-converter.py
> >
>

2023-06-24 00:07:30

by Namhyung Kim

Subject: Re: [PATCH 4/9] scripts: python: Implement parsing of input data in convertPerfScriptProfile

Hi Anup,

On Wed, Jun 21, 2023 at 12:41 PM Anup Sharma <[email protected]> wrote:
>
> The lines variable is created by splitting the profile string into individual
> lines. It allows for iterating over each line for processing.
>
> The line is considered the start of a sample. It is matched against a regular
> expression pattern to extract relevant information such as before_time_stamp,
> time_stamp, threadNamePidAndTidMatch, threadName, pid, and tid.
>
> The stack frames of the current sample are then parsed in a nested loop.
> Each stackFrameLine is matched against a regular expression pattern to
> extract rawFunc and mod information.
>
> Also fixed few checkpatch warnings.
>
> Signed-off-by: Anup Sharma <[email protected]>
> ---
> .../scripts/python/firefox-gecko-converter.py | 62 ++++++++++++++++++-
> 1 file changed, 60 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> index 0ff70c0349c8..e5bc7a11c3e6 100644
> --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> @@ -1,4 +1,5 @@
> #!/usr/bin/env python3
> +# SPDX-License-Identifier: GPL-2.0

Please put this line in the first commit.

> import re
> import sys
> import json
> @@ -14,13 +15,13 @@ def isPerfScriptFormat(profile):
> firstLine = profile[:profile.index('\n')]
> return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))
>
> -def convertPerfScriptProfile(profile):
> +def convertPerfScriptProfile(profile):

You'd better configure your editor to warn about trailing whitespace,
or even to fix it automatically.

Thanks,
Namhyung

>
> def addSample(threadName, stackArray, time):
> nonlocal name
> if name != threadName:
> name = threadName
> - # TODO:
> + # TODO:
> # get_or_create_stack will create a new stack if it doesn't exist, or return the existing stack if it does.
> # get_or_create_frame will create a new frame if it doesn't exist, or return the existing frame if it does.
> stack = reduce(lambda prefix, stackFrame: get_or_create_stack(get_or_create_frame(stackFrame), prefix), stackArray, None)
> @@ -54,3 +55,60 @@ def convertPerfScriptProfile(profile):
> thread = _createtread(threadName, pid, tid)
> threadMap[tid] = thread
> thread['addSample'](threadName, stack, time_stamp)
> +
> + lines = profile.split('\n')
> +
> + line_index = 0
> + startTime = 0
> + while line_index < len(lines):
> + line = lines[line_index]
> + line_index += 1
> + # perf script --header outputs header lines beginning with #
> + if line == '' or line.startswith('#'):
> + continue
> +
> + sample_start_line = line
> +
> + sample_start_match = re.match(r'^(.*)\s+([\d.]+):', sample_start_line)
> + if not sample_start_match:
> + print(f'Could not parse line as the start of a sample in the "perf script" profile format: "{sample_start_line}"')
> + continue
> +
> + before_time_stamp = sample_start_match[1]
> + time_stamp = float(sample_start_match[2]) * 1000
> + threadNamePidAndTidMatch = re.match(r'^(.*)\s+(?:(\d+)\/)?(\d+)\b', before_time_stamp)
> +
> + if not threadNamePidAndTidMatch:
> + print('Could not parse line as the start of a sample in the "perf script" profile format: "%s"' % sample_start_line)
> + continue
> + threadName = threadNamePidAndTidMatch[1].strip()
> + pid = int(threadNamePidAndTidMatch[2] or 0)
> + tid = int(threadNamePidAndTidMatch[3] or 0)
> + if startTime == 0:
> + startTime = time_stamp
> + # Parse the stack frames of the current sample in a nested loop.
> + stack = []
> + while line_index < len(lines):
> + stackFrameLine = lines[line_index]
> + line_index += 1
> + if stackFrameLine.strip() == '':
> + # Sample ends.
> + break
> + stackFrameMatch = re.match(r'^\s*(\w+)\s*(.+) \(([^)]*)\)$', stackFrameLine)
> + if stackFrameMatch:
> + rawFunc = stackFrameMatch[2]
> + mod = stackFrameMatch[3]
> + rawFunc = re.sub(r'\+0x[\da-f]+$', '', rawFunc)
> +
> + if rawFunc.startswith('('):
> + continue # skip process names
> +
> + if mod:
> + # If we have a module name, provide it.
> + # The code processing the profile will search for
> + # "functionName (in libraryName)" using a regexp,
> + # and automatically create the library information.
> + rawFunc += f' (in {mod})'
> +
> + stack.append(rawFunc)
> +
> --
> 2.34.1
>

2023-06-24 00:41:43

by Namhyung Kim

Subject: Re: [PATCH 7/9] scripts: python: implement get or create frame function

On Wed, Jun 21, 2023 at 12:45 PM Anup Sharma <[email protected]> wrote:
>
> The CATEGORIES list and the USER_CATEGORY_INDEX and
> KERNEL_CATEGORY_INDEX constants has been introduced.
>
> The get_or_create_frame function is responsible for retrieving or
> creating a frame based on the provided frameString. If the frame
> corresponding to the frameString is found in the frameMap, it is
> returned. Otherwise, a new frame is created by appending relevant
> information to the frameTable's 'data' array and adding the
> frameString to the stringTable.
>
> The index of the newly created frame is added to the frameMap.
>
> Signed-off-by: Anup Sharma <[email protected]>
> ---
> .../scripts/python/firefox-gecko-converter.py | 38 +++++++++++++++++++
> 1 file changed, 38 insertions(+)
>
> diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> index 30fc542cfdeb..866751e5d1ce 100644
> --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> @@ -15,6 +15,13 @@ def isPerfScriptFormat(profile):
> firstLine = profile[:profile.index('\n')]
> return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))
>
> +CATEGORIES = [
> +{'name': 'User', 'color': 'yellow', 'subcategories': ['Other']},
> +{'name': 'Kernel', 'color': 'orange', 'subcategories': ['Other']}
> +]
> +USER_CATEGORY_INDEX = 0
> +KERNEL_CATEGORY_INDEX = 1
> +
> def convertPerfScriptProfile(profile):
> def _createtread(name, pid, tid):
> markers = {
> @@ -70,6 +77,37 @@ def convertPerfScriptProfile(profile):
> stackMap[key] = stack
> return stack
>
> + frameMap = dict()
> + def get_or_create_frame(frameString):
> + frame = frameMap.get(frameString)
> + if frame is None:
> + frame = len(frameTable['data'])
> + location = len(stringTable)
> + stringTable.append(frameString)
> +
> + category = KERNEL_CATEGORY_INDEX if frameString.find('kallsyms') != -1 or frameString.find('/vmlinux') != -1 or frameString.endswith('.ko)') else USER_CATEGORY_INDEX

This line is too long; we usually don't allow lines
over 100 characters.

Thanks,
Namhyung

> + implementation = None
> + optimizations = None
> + line = None
> + relevantForJS = False
> + subcategory = None
> + innerWindowID = 0
> + column = None
> +
> + frameTable['data'].append([
> + location,
> + relevantForJS,
> + innerWindowID,
> + implementation,
> + optimizations,
> + line,
> + column,
> + category,
> + subcategory,
> + ])
> + frameMap[frameString] = frame
> + return frame
> +
> def addSample(threadName, stackArray, time):
> nonlocal name
> if name != threadName:
> --
> 2.34.1
>

2023-07-05 20:23:41

by Anup Sharma

Subject: Re: [PATCH 7/9] scripts: python: implement get or create frame function

On Fri, Jun 23, 2023 at 05:04:56PM -0700, Namhyung Kim wrote:
> On Wed, Jun 21, 2023 at 12:45 PM Anup Sharma <[email protected]> wrote:
> >
> > The CATEGORIES list and the USER_CATEGORY_INDEX and
> > KERNEL_CATEGORY_INDEX constants has been introduced.
> >
> > The get_or_create_frame function is responsible for retrieving or
> > creating a frame based on the provided frameString. If the frame
> > corresponding to the frameString is found in the frameMap, it is
> > returned. Otherwise, a new frame is created by appending relevant
> > information to the frameTable's 'data' array and adding the
> > frameString to the stringTable.
> >
> > The index of the newly created frame is added to the frameMap.
> >
> > Signed-off-by: Anup Sharma <[email protected]>
> > ---
> > .../scripts/python/firefox-gecko-converter.py | 38 +++++++++++++++++++
> > 1 file changed, 38 insertions(+)
> >
> > diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> > index 30fc542cfdeb..866751e5d1ce 100644
> > --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> > +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> > @@ -15,6 +15,13 @@ def isPerfScriptFormat(profile):
> > firstLine = profile[:profile.index('\n')]
> > return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))
> >
> > +CATEGORIES = [
> > +{'name': 'User', 'color': 'yellow', 'subcategories': ['Other']},
> > +{'name': 'Kernel', 'color': 'orange', 'subcategories': ['Other']}
> > +]
> > +USER_CATEGORY_INDEX = 0
> > +KERNEL_CATEGORY_INDEX = 1
> > +
> > def convertPerfScriptProfile(profile):
> > def _createtread(name, pid, tid):
> > markers = {
> > @@ -70,6 +77,37 @@ def convertPerfScriptProfile(profile):
> > stackMap[key] = stack
> > return stack
> >
> > + frameMap = dict()
> > + def get_or_create_frame(frameString):
> > + frame = frameMap.get(frameString)
> > + if frame is None:
> > + frame = len(frameTable['data'])
> > + location = len(stringTable)
> > + stringTable.append(frameString)
> > +
> > + category = KERNEL_CATEGORY_INDEX if frameString.find('kallsyms') != -1 or frameString.find('/vmlinux') != -1 or frameString.endswith('.ko)') else USER_CATEGORY_INDEX
>
> This line is too long, we usually don't allow long lines
> over 100 characters.

Thanks for the suggestion. I have taken care of this in the latest version.
Is there any way to add such checks in the editor itself? I used the
checkpatch.pl script, but it didn't catch this.
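
For reference, one way to keep that conditional under 100 columns is to
move the test into a small helper. This is only a sketch: the helper name
frame_category is made up for illustration, and the later revision of the
patch may well do it differently.

```python
USER_CATEGORY_INDEX = 0
KERNEL_CATEGORY_INDEX = 1


def frame_category(frame_string):
    # Hypothetical helper: the same test as the one-liner in the patch,
    # split across lines so that none of them exceeds 100 characters.
    is_kernel = (
        'kallsyms' in frame_string
        or '/vmlinux' in frame_string
        or frame_string.endswith('.ko)')
    )
    return KERNEL_CATEGORY_INDEX if is_kernel else USER_CATEGORY_INDEX
```

Using the "in" operator instead of find() != -1 keeps the behaviour the
same while shortening each clause.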

> Thanks,
> Namhyung
>
>
> > + implementation = None
> > + optimizations = None
> > + line = None
> > + relevantForJS = False
> > + subcategory = None
> > + innerWindowID = 0
> > + column = None
> > +
> > + frameTable['data'].append([
> > + location,
> > + relevantForJS,
> > + innerWindowID,
> > + implementation,
> > + optimizations,
> > + line,
> > + column,
> > + category,
> > + subcategory,
> > + ])
> > + frameMap[frameString] = frame
> > + return frame
> > +
> > def addSample(threadName, stackArray, time):
> > nonlocal name
> > if name != threadName:
> > --
> > 2.34.1
> >

2023-07-05 21:21:17

by Anup Sharma

Subject: Re: [PATCH 4/9] scripts: python: Implement parsing of input data in convertPerfScriptProfile

On Fri, Jun 23, 2023 at 05:03:12PM -0700, Namhyung Kim wrote:
> Hi Anup,
>
> On Wed, Jun 21, 2023 at 12:41 PM Anup Sharma <[email protected]> wrote:
> >
> > The lines variable is created by splitting the profile string into individual
> > lines. It allows for iterating over each line for processing.
> >
> > The line is considered the start of a sample. It is matched against a regular
> > expression pattern to extract relevant information such as before_time_stamp,
> > time_stamp, threadNamePidAndTidMatch, threadName, pid, and tid.
> >
> > The stack frames of the current sample are then parsed in a nested loop.
> > Each stackFrameLine is matched against a regular expression pattern to
> > extract rawFunc and mod information.
> >
> > Also fixed few checkpatch warnings.
> >
> > Signed-off-by: Anup Sharma <[email protected]>
> > ---
> > .../scripts/python/firefox-gecko-converter.py | 62 ++++++++++++++++++-
> > 1 file changed, 60 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> > index 0ff70c0349c8..e5bc7a11c3e6 100644
> > --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> > +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> > @@ -1,4 +1,5 @@
> > #!/usr/bin/env python3
> > +# SPDX-License-Identifier: GPL-2.0
>
> Please put this line in the first commit.

Sure, done in the latest version.

> > import re
> > import sys
> > import json
> > @@ -14,13 +15,13 @@ def isPerfScriptFormat(profile):
> > firstLine = profile[:profile.index('\n')]
> > return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))
> >
> > -def convertPerfScriptProfile(profile):
> > +def convertPerfScriptProfile(profile):
>
> You'd better configure your editor to warn or even fix
> the trailing whitespace automatically.

Thanks, I followed your advice and configured my nvim to flag trailing
whitespace automatically. It has significantly improved my workflow.
Here's the snippet I added to my vimrc file, which highlights it:

highlight ExtraWhitespace ctermbg=white guibg=white
match ExtraWhitespace /\s\+$/
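
The highlight only shows the problem inside the editor. As a complement, a
small script along these lines could flag trailing whitespace and over-long
lines before a patch is sent. This is a minimal sketch and not part of the
series: the function names are hypothetical, and the 100-column default is
taken from the review comment on patch 7/9.

```python
MAX_COLUMNS = 100  # assumed limit, from the review comment on patch 7/9


def check_lines(lines, max_columns=MAX_COLUMNS):
    """Return (line_number, message) pairs for simple style problems."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip('\n')
        if text != text.rstrip():
            problems.append((number, 'trailing whitespace'))
        if len(text) > max_columns:
            problems.append((number, f'line longer than {max_columns} characters'))
    return problems


def main(paths):
    """Print every problem found in the given files; return 1 if any, else 0."""
    status = 0
    for path in paths:
        with open(path, encoding='utf-8') as source:
            for number, message in check_lines(source):
                print(f'{path}:{number}: {message}')
                status = 1
    return status
```

To run it as a command, the script would end with something like
sys.exit(main(sys.argv[1:])) after importing sys.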

> Thanks,
> Namhyung
>
>
> >
> > def addSample(threadName, stackArray, time):
> > nonlocal name
> > if name != threadName:
> > name = threadName
> > - # TODO:
> > + # TODO:
> > # get_or_create_stack will create a new stack if it doesn't exist, or return the existing stack if it does.
> > # get_or_create_frame will create a new frame if it doesn't exist, or return the existing frame if it does.
> > stack = reduce(lambda prefix, stackFrame: get_or_create_stack(get_or_create_frame(stackFrame), prefix), stackArray, None)
> > @@ -54,3 +55,60 @@ def convertPerfScriptProfile(profile):
> > thread = _createtread(threadName, pid, tid)
> > threadMap[tid] = thread
> > thread['addSample'](threadName, stack, time_stamp)
> > +
> > + lines = profile.split('\n')
> > +
> > + line_index = 0
> > + startTime = 0
> > + while line_index < len(lines):
> > + line = lines[line_index]
> > + line_index += 1
> > + # perf script --header outputs header lines beginning with #
> > + if line == '' or line.startswith('#'):
> > + continue
> > +
> > + sample_start_line = line
> > +
> > + sample_start_match = re.match(r'^(.*)\s+([\d.]+):', sample_start_line)
> > + if not sample_start_match:
> > + print(f'Could not parse line as the start of a sample in the "perf script" profile format: "{sample_start_line}"')
> > + continue
> > +
> > + before_time_stamp = sample_start_match[1]
> > + time_stamp = float(sample_start_match[2]) * 1000
> > + threadNamePidAndTidMatch = re.match(r'^(.*)\s+(?:(\d+)\/)?(\d+)\b', before_time_stamp)
> > +
> > + if not threadNamePidAndTidMatch:
> > + print('Could not parse line as the start of a sample in the "perf script" profile format: "%s"' % sample_start_line)
> > + continue
> > + threadName = threadNamePidAndTidMatch[1].strip()
> > + pid = int(threadNamePidAndTidMatch[2] or 0)
> > + tid = int(threadNamePidAndTidMatch[3] or 0)
> > + if startTime == 0:
> > + startTime = time_stamp
> > + # Parse the stack frames of the current sample in a nested loop.
> > + stack = []
> > + while line_index < len(lines):
> > + stackFrameLine = lines[line_index]
> > + line_index += 1
> > + if stackFrameLine.strip() == '':
> > + # Sample ends.
> > + break
> > + stackFrameMatch = re.match(r'^\s*(\w+)\s*(.+) \(([^)]*)\)$', stackFrameLine)
> > + if stackFrameMatch:
> > + rawFunc = stackFrameMatch[2]
> > + mod = stackFrameMatch[3]
> > + rawFunc = re.sub(r'\+0x[\da-f]+$', '', rawFunc)
> > +
> > + if rawFunc.startswith('('):
> > + continue # skip process names
> > +
> > + if mod:
> > + # If we have a module name, provide it.
> > + # The code processing the profile will search for
> > + # "functionName (in libraryName)" using a regexp,
> > + # and automatically create the library information.
> > + rawFunc += f' (in {mod})'
> > +
> > + stack.append(rawFunc)
> > +
> > --
> > 2.34.1
> >

2023-07-06 16:16:00

by Ian Rogers

Subject: Re: [PATCH 7/9] scripts: python: implement get or create frame function

On Wed, Jul 5, 2023 at 1:01 PM Anup Sharma <[email protected]> wrote:
>
> On Fri, Jun 23, 2023 at 05:04:56PM -0700, Namhyung Kim wrote:
> > On Wed, Jun 21, 2023 at 12:45 PM Anup Sharma <[email protected]> wrote:
> > >
> > > > The CATEGORIES list and the USER_CATEGORY_INDEX and
> > > > KERNEL_CATEGORY_INDEX constants have been introduced.
> > >
> > > The get_or_create_frame function is responsible for retrieving or
> > > creating a frame based on the provided frameString. If the frame
> > > corresponding to the frameString is found in the frameMap, it is
> > > returned. Otherwise, a new frame is created by appending relevant
> > > information to the frameTable's 'data' array and adding the
> > > frameString to the stringTable.
> > >
> > > The index of the newly created frame is added to the frameMap.
> > >
> > > Signed-off-by: Anup Sharma <[email protected]>
> > > ---
> > > .../scripts/python/firefox-gecko-converter.py | 38 +++++++++++++++++++
> > > 1 file changed, 38 insertions(+)
> > >
> > > diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> > > index 30fc542cfdeb..866751e5d1ce 100644
> > > --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> > > +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> > > @@ -15,6 +15,13 @@ def isPerfScriptFormat(profile):
> > > firstLine = profile[:profile.index('\n')]
> > > return bool(re.match(r'^\S.*?\s+(?:\d+/)?\d+\s+(?:\d+\d+\s+)?[\d.]+:', firstLine))
> > >
> > > +CATEGORIES = [
> > > +{'name': 'User', 'color': 'yellow', 'subcategories': ['Other']},
> > > +{'name': 'Kernel', 'color': 'orange', 'subcategories': ['Other']}
> > > +]
> > > +USER_CATEGORY_INDEX = 0
> > > +KERNEL_CATEGORY_INDEX = 1
> > > +
> > > def convertPerfScriptProfile(profile):
> > > def _createtread(name, pid, tid):
> > > markers = {
> > > @@ -70,6 +77,37 @@ def convertPerfScriptProfile(profile):
> > > stackMap[key] = stack
> > > return stack
> > >
> > > + frameMap = dict()
> > > + def get_or_create_frame(frameString):
> > > + frame = frameMap.get(frameString)
> > > + if frame is None:
> > > + frame = len(frameTable['data'])
> > > + location = len(stringTable)
> > > + stringTable.append(frameString)
> > > +
> > > + category = KERNEL_CATEGORY_INDEX if frameString.find('kallsyms') != -1 or frameString.find('/vmlinux') != -1 or frameString.endswith('.ko)') else USER_CATEGORY_INDEX
> >
> > This line is too long, we usually don't allow long lines
> > over 100 characters.
>
> Thanks for the suggestion. I have taken care of this in the latest version.
> Is there any way to add such checks in the editor itself? I used the
> checkpatch.pl script, but it didn't catch this.

Unfortunately checkpatch.pl doesn't work for python code yet. I think
using mypy types would be useful:
https://github.com/python/mypy
Also, having docstrings on the functions would be useful. Some of the
code has fairly complex indirection, and it'd be nice to understand
why.
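
As a rough sketch of that suggestion (not code from the series: the
FrameTable class and its attribute names are hypothetical, chosen only to
make the example self-contained), get_or_create_frame with type hints and a
docstring could look something like this:

```python
from typing import Dict, List

USER_CATEGORY_INDEX = 0
KERNEL_CATEGORY_INDEX = 1


class FrameTable:
    """Hypothetical stand-in for the frameTable/stringTable/frameMap closure state."""

    def __init__(self) -> None:
        self.string_table: List[str] = []
        self.data: List[list] = []
        self.frame_map: Dict[str, int] = {}

    def get_or_create_frame(self, frame_string: str) -> int:
        """Return the frame table index for frame_string.

        Frames are deduplicated by their string form: the first time a
        string is seen, a new row is appended and its index is cached in
        frame_map; later lookups return the cached index.
        """
        cached = self.frame_map.get(frame_string)
        if cached is not None:
            return cached
        frame = len(self.data)
        location = len(self.string_table)
        self.string_table.append(frame_string)
        is_kernel = (
            'kallsyms' in frame_string
            or '/vmlinux' in frame_string
            or frame_string.endswith('.ko)')
        )
        category = KERNEL_CATEGORY_INDEX if is_kernel else USER_CATEGORY_INDEX
        # Row order follows the patch: location, relevantForJS, innerWindowID,
        # implementation, optimizations, line, column, category, subcategory.
        self.data.append([location, False, 0, None, None, None, None, category, None])
        self.frame_map[frame_string] = frame
        return frame
```

With annotations like these, mypy can check that callers pass a string and
treat the result as an index.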

Thanks,
Ian

> > Thanks,
> > Namhyung
> >
> >
> > > + implementation = None
> > > + optimizations = None
> > > + line = None
> > > + relevantForJS = False
> > > + subcategory = None
> > > + innerWindowID = 0
> > > + column = None
> > > +
> > > + frameTable['data'].append([
> > > + location,
> > > + relevantForJS,
> > > + innerWindowID,
> > > + implementation,
> > > + optimizations,
> > > + line,
> > > + column,
> > > + category,
> > > + subcategory,
> > > + ])
> > > + frameMap[frameString] = frame
> > > + return frame
> > > +
> > > def addSample(threadName, stackArray, time):
> > > nonlocal name
> > > if name != threadName:
> > > --
> > > 2.34.1
> > >