2023-07-05 20:02:18

by Anup Sharma

Subject: [PATCH v2 0/7] Add support for Firefox's gecko profile format

This patch series adds support for Firefox's gecko profile format.
The format is documented here [1].

I have addressed a few comments from the previous version of the patch,
such as using the perf script Python interface to process the samples.
I also fixed trailing whitespace and other minor issues.

[1] https://github.com/firefox-devtools/profiler/blob/main/docs-developer/gecko-profile-format.md

Anup Sharma (7):
scripts: python: Extract necessary information from process event
scripts: python: Introduce thread sample processing to create thread
scripts: python: create threads with schemas
scripts: python: implement get or create stack function
scripts: python: implement get or create frame function
scripts: python: implement add sample function and return finish
scripts: python: Add trace end processing and JSON output

.../scripts/python/firefox-gecko-converter.py | 204 ++++++++++++++++++
1 file changed, 204 insertions(+)
create mode 100644 tools/perf/scripts/python/firefox-gecko-converter.py

--
2.34.1



2023-07-05 20:12:27

by Anup Sharma

Subject: [PATCH v2 7/7] scripts: python: Add trace end processing and JSON output

The trace_end function dumps the final output to standard output
as JSON in the gecko profile format. Additionally, constants
such as USER_CATEGORY_INDEX, KERNEL_CATEGORY_INDEX, CATEGORIES, and
PRODUCT are defined to provide contextual information.
Also add the _addThreadSample calls, which were missing.

Signed-off-by: Anup Sharma <[email protected]>
---
.../scripts/python/firefox-gecko-converter.py | 40 +++++++++++++++++++
1 file changed, 40 insertions(+)

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
index 910e598c743f..6a2a4d816799 100644
--- a/tools/perf/scripts/python/firefox-gecko-converter.py
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -18,9 +18,47 @@ sys.path.append(os.environ['PERF_EXEC_PATH'] + \
from perf_trace_context import *
from Core import *

+USER_CATEGORY_INDEX = 0
+KERNEL_CATEGORY_INDEX = 1
thread_map = {}
start_time = None

+CATEGORIES = [
+    {'name': 'User', 'color': 'yellow', 'subcategories': ['Other']},
+    {'name': 'Kernel', 'color': 'orange', 'subcategories': ['Other']}
+]
+
+PRODUCT = os.popen('uname -op').read().strip()
+
+def trace_end():
+    thread_array = list(map(lambda thread: thread['finish'](), thread_map.values()))
+    for thread in thread_array:
+        key = thread['samples']['schema']['time']
+        thread['samples']['data'].sort(key=lambda data : float(data[key]))
+
+    result = {
+        'meta': {
+            'interval': 1,
+            'processType': 0,
+            'product': PRODUCT,
+            'stackwalk': 1,
+            'debug': 0,
+            'gcpoison': 0,
+            'asyncstack': 1,
+            'startTime': start_time,
+            'shutdownTime': None,
+            'version': 24,
+            'presymbolicated': True,
+            'categories': CATEGORIES,
+            'markerSchema': []
+        },
+        'libs': [],
+        'threads': thread_array,
+        'processes': [],
+        'pausedRanges': []
+    }
+    json.dump(result, sys.stdout, indent=2)
+
def process_event(param_dict):
global start_time
global thread_map
@@ -159,6 +197,8 @@ def process_event(param_dict):
stack.append(call['sym']['name'] + f' (in {call["dso"]})')
if len(stack) != 0:
stack = stack[::-1]
+ _addThreadSample(pid, tid, thread_name, time_stamp, stack)
else:
mod = param_dict['symbol'] if 'symbol' in param_dict else '[unknown]'
dso = param_dict['dso'] if 'dso' in param_dict else '[unknown]'
+ _addThreadSample(pid, tid, thread_name, time_stamp, [mod + f' (in {dso})'])
--
2.34.1
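
The sorting step in trace_end can be tried in isolation. The following
is a minimal sketch, not the patch itself: the sample thread, its schema
columns other than 'time', and the helper name sort_samples_by_time are
assumptions made up for illustration.

```python
def sort_samples_by_time(thread):
    """Sort a thread's sample rows in place by the schema's 'time' column."""
    # The schema maps a column name to its position inside each row.
    time_col = thread['samples']['schema']['time']
    thread['samples']['data'].sort(key=lambda row: float(row[time_col]))
    return thread

# Hypothetical thread: each row is [stack_index, time, responsiveness].
thread = {
    'samples': {
        'schema': {'stack': 0, 'time': 1, 'responsiveness': 2},
        'data': [[2, 30.5, 0], [0, 10.0, 0], [1, 20.25, 0]],
    }
}
sort_samples_by_time(thread)
times = [row[1] for row in thread['samples']['data']]
```

Looking up the column position through the schema, rather than hard-coding
it, keeps the sort valid if the column order of the sample rows changes.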


2023-07-05 20:13:41

by Anup Sharma

Subject: [PATCH v2 5/7] scripts: python: implement get or create frame function

The get_or_create_frame function is responsible for retrieving or
creating a frame based on the provided frameString. If the frame
corresponding to the frameString is found in the frameMap, it is
returned. Otherwise, a new frame is created by appending relevant
information to the frameTable's 'data' array and adding the
frameString to the stringTable.

The index of the newly created frame is added to the frameMap.

Signed-off-by: Anup Sharma <[email protected]>
---
.../scripts/python/firefox-gecko-converter.py | 33 +++++++++++++++++++
1 file changed, 33 insertions(+)

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
index 6f69c083d3ff..d5b9fb16e520 100644
--- a/tools/perf/scripts/python/firefox-gecko-converter.py
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -77,6 +77,39 @@ def process_event(param_dict):
stackMap[key] = stack
return stack

+ frameMap = dict()
+ def get_or_create_frame(frameString):
+ frame = frameMap.get(frameString)
+ if frame is None:
+ frame = len(frameTable['data'])
+ location = len(stringTable)
+ stringTable.append(frameString)
+ category = KERNEL_CATEGORY_INDEX if frameString.find('kallsyms') != -1 \
+ or frameString.find('/vmlinux') != -1 \
+ or frameString.endswith('.ko)') \
+ else USER_CATEGORY_INDEX
+ implementation = None
+ optimizations = None
+ line = None
+ relevantForJS = False
+ subcategory = None
+ innerWindowID = 0
+ column = None
+
+ frameTable['data'].append([
+ location,
+ relevantForJS,
+ innerWindowID,
+ implementation,
+ optimizations,
+ line,
+ column,
+ category,
+ subcategory,
+ ])
+ frameMap[frameString] = frame
+ return frame
+
def _addThreadSample(pid, tid, threadName, time_stamp, stack):
thread = thread_map.get(tid)
if not thread:
--
2.34.1
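
As a standalone illustration of the get-or-create pattern described in
the commit message, here is a minimal sketch. It assumes the string
append sits inside the "if frame is None" branch, trims the frame row to
two columns (location, category), and uses a hypothetical
make_frame_interner wrapper so that it can run outside perf.

```python
USER_CATEGORY_INDEX = 0
KERNEL_CATEGORY_INDEX = 1

def make_frame_interner():
    """Return a get_or_create_frame function plus the tables it fills."""
    string_table = []   # unique frame strings
    frame_rows = []     # one [location, category] row per unique frame
    frame_map = {}      # frame string -> index into frame_rows

    def get_or_create_frame(frame_string):
        frame = frame_map.get(frame_string)
        if frame is None:
            # First time this string is seen: allocate a new row.
            frame = len(frame_rows)
            location = len(string_table)
            string_table.append(frame_string)
            is_kernel = ('kallsyms' in frame_string
                         or '/vmlinux' in frame_string
                         or frame_string.endswith('.ko)'))
            category = KERNEL_CATEGORY_INDEX if is_kernel else USER_CATEGORY_INDEX
            frame_rows.append([location, category])
            frame_map[frame_string] = frame
        return frame

    return get_or_create_frame, string_table, frame_rows

get_or_create_frame, string_table, frame_rows = make_frame_interner()
a = get_or_create_frame('main (in /usr/bin/app)')
b = get_or_create_frame('do_syscall_64 (in [kernel.kallsyms])')
c = get_or_create_frame('main (in /usr/bin/app)')  # repeated string
```

Under that assumption, a frame string that is seen again returns the
existing index, so the string table grows only once per unique string.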


2023-07-06 06:24:27

by Namhyung Kim

Subject: Re: [PATCH v2 5/7] scripts: python: implement get or create frame function

On Wed, Jul 5, 2023 at 12:48 PM Anup Sharma <[email protected]> wrote:
>
> The get_or_create_frame function is responsible for retrieving or
> creating a frame based on the provided frameString. If the frame
> corresponding to the frameString is found in the frameMap, it is
> returned. Otherwise, a new frame is created by appending relevant
> information to the frameTable's 'data' array and adding the
> frameString to the stringTable.
>
> The index of the newly created frame is added to the frameMap.
>
> Signed-off-by: Anup Sharma <[email protected]>
> ---
> .../scripts/python/firefox-gecko-converter.py | 33 +++++++++++++++++++
> 1 file changed, 33 insertions(+)
>
> diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> index 6f69c083d3ff..d5b9fb16e520 100644
> --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> @@ -77,6 +77,39 @@ def process_event(param_dict):
> stackMap[key] = stack
> return stack
>
> + frameMap = dict()
> + def get_or_create_frame(frameString):
> + frame = frameMap.get(frameString)
> + if frame is None:
> + frame = len(frameTable['data'])
> + location = len(stringTable)
> + stringTable.append(frameString)

Looks like it always just appends a new string.
Is there any deduplication later?

> + category = KERNEL_CATEGORY_INDEX if frameString.find('kallsyms') != -1 \
> + or frameString.find('/vmlinux') != -1 \
> + or frameString.endswith('.ko)') \
> + else USER_CATEGORY_INDEX

I think you can use param_dict['sample']['cpumode'].
Please see include/uapi/linux/perf_event.h for cpumode
values.
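
For reference, a minimal sketch of that idea. The PERF_RECORD_MISC_*
values are copied from include/uapi/linux/perf_event.h; the helper name
category_for_cpumode is hypothetical, and treating every non-kernel mode
(including hypervisor and guest modes) as the User category is an
assumption of this sketch.

```python
# Values as defined in include/uapi/linux/perf_event.h.
PERF_RECORD_MISC_CPUMODE_MASK = 7
PERF_RECORD_MISC_KERNEL = 1
PERF_RECORD_MISC_USER = 2

USER_CATEGORY_INDEX = 0
KERNEL_CATEGORY_INDEX = 1

def category_for_cpumode(cpumode):
    """Map a sample's cpumode to a category index (hypothetical helper)."""
    # Masking is harmless if the value is already reduced to the cpumode bits.
    mode = cpumode & PERF_RECORD_MISC_CPUMODE_MASK
    if mode == PERF_RECORD_MISC_KERNEL:
        return KERNEL_CATEGORY_INDEX
    # Assumption: everything else is shown as User.
    return USER_CATEGORY_INDEX

kernel_category = category_for_cpumode(PERF_RECORD_MISC_KERNEL)
user_category = category_for_cpumode(PERF_RECORD_MISC_USER)
```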

> + implementation = None
> + optimizations = None
> + line = None
> + relevantForJS = False
> + subcategory = None
> + innerWindowID = 0
> + column = None
> +
> + frameTable['data'].append([
> + location,
> + relevantForJS,
> + innerWindowID,
> + implementation,
> + optimizations,
> + line,
> + column,
> + category,
> + subcategory,
> + ])
> + frameMap[frameString] = frame

I think it'd be better if you define the frameTable in this
commit.

Thanks,
Namhyung


> + return frame
> +
> def _addThreadSample(pid, tid, threadName, time_stamp, stack):
> thread = thread_map.get(tid)
> if not thread:
> --
> 2.34.1
>

2023-07-10 23:08:26

by Anup Sharma

Subject: Re: [PATCH v2 5/7] scripts: python: implement get or create frame function

On Wed, Jul 05, 2023 at 11:06:58PM -0700, Namhyung Kim wrote:
> On Wed, Jul 5, 2023 at 12:48 PM Anup Sharma <[email protected]> wrote:
> >
> > The get_or_create_frame function is responsible for retrieving or
> > creating a frame based on the provided frameString. If the frame
> > corresponding to the frameString is found in the frameMap, it is
> > returned. Otherwise, a new frame is created by appending relevant
> > information to the frameTable's 'data' array and adding the
> > frameString to the stringTable.
> >
> > The index of the newly created frame is added to the frameMap.
> >
> > Signed-off-by: Anup Sharma <[email protected]>
> > ---
> > .../scripts/python/firefox-gecko-converter.py | 33 +++++++++++++++++++
> > 1 file changed, 33 insertions(+)
> >
> > diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> > index 6f69c083d3ff..d5b9fb16e520 100644
> > --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> > +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> > @@ -77,6 +77,39 @@ def process_event(param_dict):
> > stackMap[key] = stack
> > return stack
> >
> > + frameMap = dict()
> > + def get_or_create_frame(frameString):
> > + frame = frameMap.get(frameString)
> > + if frame is None:
> > + frame = len(frameTable['data'])
> > + location = len(stringTable)
> > + stringTable.append(frameString)
>
> Looks like it always just appends a new string.
> Is there any deduplication later?

This did occur to me, and almost all stack frames look the same across
repeated calls. However, I am not sure we can deduplicate, because some
frames do differ as the call stack progresses. For example, when a
process returns from one function and then calls another, most of the
stack frames stay the same but the last one differs.
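
To illustrate the case above, here is a minimal sketch of a prefix-linked
stack table, where each entry stores only its top frame and the index of
the stack below it. The helper names and the (prefix, frame) tuple key are
assumptions for illustration, not the code from this series.

```python
def make_stack_interner():
    """Return a get_or_create_stack function plus the table it fills."""
    stack_rows = []   # one [prefix, frame] row per unique stack
    stack_map = {}    # (prefix, frame) -> index into stack_rows

    def get_or_create_stack(frame, prefix):
        key = (prefix, frame)
        stack = stack_map.get(key)
        if stack is None:
            stack = len(stack_rows)
            stack_rows.append([prefix, frame])
            stack_map[key] = stack
        return stack

    return get_or_create_stack, stack_rows

def intern_call_stack(frames, get_or_create_stack):
    """Intern a root-first list of frame indices, returning the leaf stack."""
    prefix = None
    for frame in frames:
        prefix = get_or_create_stack(frame, prefix)
    return prefix

get_or_create_stack, stack_rows = make_stack_interner()
first = intern_call_stack([0, 1, 2], get_or_create_stack)
second = intern_call_stack([0, 1, 3], get_or_create_stack)  # last frame differs
```

In this sketch the two stacks share the entries for frames 0 and 1, and
only the differing last frame adds a new row.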

> > + category = KERNEL_CATEGORY_INDEX if frameString.find('kallsyms') != -1 \
> > + or frameString.find('/vmlinux') != -1 \
> > + or frameString.endswith('.ko)') \
> > + else USER_CATEGORY_INDEX
>
> I think you can use param_dict['sample']['cpumode'].
> Please see include/uapi/linux/perf_event.h for cpumode
> values.

I am working on using param_dict['sample']['cpumode'] to determine
the category in the upcoming v4 update. I see that
param_dict['sample']['cpumode'] takes the values 1 and 2. Could you
point me to the exact line in include/uapi/linux/perf_event.h?
I am not able to find it.

> > + implementation = None
> > + optimizations = None
> > + line = None
> > + relevantForJS = False
> > + subcategory = None
> > + innerWindowID = 0
> > + column = None
> > +
> > + frameTable['data'].append([
> > + location,
> > + relevantForJS,
> > + innerWindowID,
> > + implementation,
> > + optimizations,
> > + line,
> > + column,
> > + category,
> > + subcategory,
> > + ])
> > + frameMap[frameString] = frame
>
> I think it'd be better if you define the frameTable in this
> commit.

Certainly, my apologies for the disorder.

> Thanks,
> Namhyung
>
>
> > + return frame
> > +
> > def _addThreadSample(pid, tid, threadName, time_stamp, stack):
> > thread = thread_map.get(tid)
> > if not thread:
> > --
> > 2.34.1
> >