2023-07-11 00:18:07

by Anup Sharma

[permalink] [raw]
Subject: [PATCH v3 3/6] scripts: python: thread sample processing to create thread with schemas

The _addThreadSample function is responsible for adding a sample
to a specific thread. It first checks if the thread exists in
the thread_map dictionary.

The markers structure defines the schema and data for
thread markers, including fields such as 'name',
'startTime', 'endTime', 'phase', 'category', and 'data'.

The samples structure defines the schema and data for thread
samples, including fields such as 'stack', 'time', and
'responsiveness'.

The frameTable structure defines the schema and data for frame
information, including fields such as 'location', 'relevantForJS',
'innerWindowID', 'implementation', 'optimizations', 'line',
'column', 'category', and 'subcategory'.

The purpose of this function is to create a new thread structure
These structures provide a framework for storing and organizing
information related to thread markers, samples, frame details,
and stack information.

The call stack is parsed to include function names and the associated
DSO, which are requires for creating json schema. Also few libaries
has been included which will be used in later commit.

Signed-off-by: Anup Sharma <[email protected]>
---
.../scripts/python/firefox-gecko-converter.py | 70 +++++++++++++++++++
1 file changed, 70 insertions(+)

diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
index 765f1775cee5..0b8a86bdcab1 100644
--- a/tools/perf/scripts/python/firefox-gecko-converter.py
+++ b/tools/perf/scripts/python/firefox-gecko-converter.py
@@ -21,6 +21,7 @@ sys.path.append(os.environ['PERF_EXEC_PATH'] + \
from perf_trace_context import *
from Core import *

+thread_map = {}
start_time = None

def trace_end():
@@ -28,6 +29,57 @@ def trace_end():

def process_event(param_dict):
global start_time
+ global thread_map
+
+ def _createThread(name, pid, tid):
+ markers = {
+ 'schema': {
+ 'name': 0,
+ 'startTime': 1,
+ 'endTime': 2,
+ 'phase': 3,
+ 'category': 4,
+ 'data': 5,
+ },
+ 'data': [],
+ }
+ samples = {
+ 'schema': {
+ 'stack': 0,
+ 'time': 1,
+ 'responsiveness': 2,
+ },
+ 'data': [],
+ }
+ frameTable = {
+ 'schema': {
+ 'location': 0,
+ 'relevantForJS': 1,
+ 'innerWindowID': 2,
+ 'implementation': 3,
+ 'optimizations': 4,
+ 'line': 5,
+ 'column': 6,
+ 'category': 7,
+ 'subcategory': 8,
+ },
+ 'data': [],
+ }
+ stackTable = {
+ 'schema': {
+ 'prefix': 0,
+ 'frame': 1,
+ },
+ 'data': [],
+ }
+ stringTable = []
+
+ def _addThreadSample(pid, tid, threadName, time_stamp, stack):
+ thread = thread_map.get(tid)
+ if not thread:
+ thread = _createThread(threadName, pid, tid)
+ thread_map[tid] = thread
+
# Extract relevant information from the event parameters. The event parameters
# are in a dictionary:
time_stamp = (param_dict['sample']['time'] // 1000) / 1000
@@ -37,3 +89,21 @@ def process_event(param_dict):

# Assume that start time is the time of the first event.
start_time = time_stamp if not start_time else start_time
+
+ # Parse the callchain of the current sample into a stack array.
+ if param_dict['callchain']:
+ stack = []
+ for call in param_dict['callchain']:
+ if 'sym' not in call:
+ continue
+ stack.append(call['sym']['name'] + f' (in {call["dso"]})')
+ if len(stack) != 0:
+ stack = stack[::-1]
+ _addThreadSample(pid, tid, thread_name, time_stamp, stack)
+
+ # During perf record if -g is not used, the callchain is not available.
+ # In that case, the symbol and dso are available in the event parameters.
+ else:
+ func = param_dict['symbol'] if 'symbol' in param_dict else '[unknown]'
+ dso = param_dict['dso'] if 'dso' in param_dict else '[unknown]'
+ _addThreadSample(pid, tid, thread_name, time_stamp, [func + f' (in {dso})'])
--
2.34.1



2023-07-12 18:03:57

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v3 3/6] scripts: python: thread sample processing to create thread with schemas

On Mon, Jul 10, 2023 at 4:13 PM Anup Sharma <[email protected]> wrote:
>
> The _addThreadSample function is responsible for adding a sample
> to a specific thread. It first checks if the thread exists in
> the thread_map dictionary.
>
> The markers structure defines the schema and data for
> thread markers, including fields such as 'name',
> 'startTime', 'endTime', 'phase', 'category', and 'data'.
>
> The samples structure defines the schema and data for thread
> samples, including fields such as 'stack', 'time', and
> 'responsiveness'.
>
> The frameTable structure defines the schema and data for frame
> information, including fields such as 'location', 'relevantForJS',
> 'innerWindowID', 'implementation', 'optimizations', 'line',
> 'column', 'category', and 'subcategory'.
>
> The purpose of this function is to create a new thread structure
> These structures provide a framework for storing and organizing
> information related to thread markers, samples, frame details,
> and stack information.
>
> The call stack is parsed to include function names and the associated
> DSO, which are requires for creating json schema. Also few libaries
> has been included which will be used in later commit.

nit: s/requires/required.
nit: I think the "Also few..." statement is out-of-date.

> Signed-off-by: Anup Sharma <[email protected]>
> ---
> .../scripts/python/firefox-gecko-converter.py | 70 +++++++++++++++++++
> 1 file changed, 70 insertions(+)
>
> diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> index 765f1775cee5..0b8a86bdcab1 100644
> --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> @@ -21,6 +21,7 @@ sys.path.append(os.environ['PERF_EXEC_PATH'] + \
> from perf_trace_context import *
> from Core import *
>

A comment and type information would be useful here. map is another
word for a dictionary, which is somewhat implied. So the information
here is that this data structure will hold something to do with
threads. Perhaps say, "a map from TID to a Thread." A better variable
name may then be tid_to_thread_map, but as map is implied you could
do: tid_to_thread: Dict[int, Thread].

> +thread_map = {}
> start_time = None
>
> def trace_end():
> @@ -28,6 +29,57 @@ def trace_end():
>
> def process_event(param_dict):
> global start_time
> + global thread_map
> +
> + def _createThread(name, pid, tid):
> + markers = {
> + 'schema': {
> + 'name': 0,
> + 'startTime': 1,
> + 'endTime': 2,
> + 'phase': 3,
> + 'category': 4,
> + 'data': 5,
> + },
> + 'data': [],
> + }
> + samples = {
> + 'schema': {
> + 'stack': 0,
> + 'time': 1,
> + 'responsiveness': 2,
> + },
> + 'data': [],
> + }
> + frameTable = {
> + 'schema': {
> + 'location': 0,
> + 'relevantForJS': 1,
> + 'innerWindowID': 2,
> + 'implementation': 3,
> + 'optimizations': 4,
> + 'line': 5,
> + 'column': 6,
> + 'category': 7,
> + 'subcategory': 8,
> + },
> + 'data': [],
> + }
> + stackTable = {
> + 'schema': {
> + 'prefix': 0,
> + 'frame': 1,
> + },
> + 'data': [],
> + }
> + stringTable = []

Is there a missing return here?

> +
> + def _addThreadSample(pid, tid, threadName, time_stamp, stack):
> + thread = thread_map.get(tid)
> + if not thread:
> + thread = _createThread(threadName, pid, tid)
> + thread_map[tid] = thread
> +
> # Extract relevant information from the event parameters. The event parameters
> # are in a dictionary:
> time_stamp = (param_dict['sample']['time'] // 1000) / 1000
> @@ -37,3 +89,21 @@ def process_event(param_dict):
>
> # Assume that start time is the time of the first event.
> start_time = time_stamp if not start_time else start_time
> +
> + # Parse the callchain of the current sample into a stack array.
> + if param_dict['callchain']:
> + stack = []
> + for call in param_dict['callchain']:
> + if 'sym' not in call:
> + continue
> + stack.append(call['sym']['name'] + f' (in {call["dso"]})')

Rather than mix an append and an f-string, just have the f-string ie:
stack.append(f'{call["sym"]["name"]} (in {"call["dso"]})')

> + if len(stack) != 0:
> + stack = stack[::-1]
> + _addThreadSample(pid, tid, thread_name, time_stamp, stack)
> +
> + # During perf record if -g is not used, the callchain is not available.
> + # In that case, the symbol and dso are available in the event parameters.
> + else:
> + func = param_dict['symbol'] if 'symbol' in param_dict else '[unknown]'
> + dso = param_dict['dso'] if 'dso' in param_dict else '[unknown]'
> + _addThreadSample(pid, tid, thread_name, time_stamp, [func + f' (in {dso})'])

Similarly:
f'{func} (in {dso})'

Thanks,
Ian

> --
> 2.34.1
>

2023-07-17 16:17:37

by Anup Sharma

[permalink] [raw]
Subject: Re: [PATCH v3 3/6] scripts: python: thread sample processing to create thread with schemas

On Wed, Jul 12, 2023 at 10:25:50AM -0700, Ian Rogers wrote:
> On Mon, Jul 10, 2023 at 4:13 PM Anup Sharma <[email protected]> wrote:
> >
> > The _addThreadSample function is responsible for adding a sample
> > to a specific thread. It first checks if the thread exists in
> > the thread_map dictionary.
> >
> > The markers structure defines the schema and data for
> > thread markers, including fields such as 'name',
> > 'startTime', 'endTime', 'phase', 'category', and 'data'.
> >
> > The samples structure defines the schema and data for thread
> > samples, including fields such as 'stack', 'time', and
> > 'responsiveness'.
> >
> > The frameTable structure defines the schema and data for frame
> > information, including fields such as 'location', 'relevantForJS',
> > 'innerWindowID', 'implementation', 'optimizations', 'line',
> > 'column', 'category', and 'subcategory'.
> >
> > The purpose of this function is to create a new thread structure
> > These structures provide a framework for storing and organizing
> > information related to thread markers, samples, frame details,
> > and stack information.
> >
> > The call stack is parsed to include function names and the associated
> > DSO, which are requires for creating json schema. Also few libaries
> > has been included which will be used in later commit.
>
> nit: s/requires/required.
> nit: I think the "Also few..." statement is out-of-date.

I apologize. I will ensure to address these in the next version.

> > Signed-off-by: Anup Sharma <[email protected]>
> > ---
> > .../scripts/python/firefox-gecko-converter.py | 70 +++++++++++++++++++
> > 1 file changed, 70 insertions(+)
> >
> > diff --git a/tools/perf/scripts/python/firefox-gecko-converter.py b/tools/perf/scripts/python/firefox-gecko-converter.py
> > index 765f1775cee5..0b8a86bdcab1 100644
> > --- a/tools/perf/scripts/python/firefox-gecko-converter.py
> > +++ b/tools/perf/scripts/python/firefox-gecko-converter.py
> > @@ -21,6 +21,7 @@ sys.path.append(os.environ['PERF_EXEC_PATH'] + \
> > from perf_trace_context import *
> > from Core import *
> >
>
> A comment and type information would be useful here. map is another
> word for a dictionary, which is somewhat implied. So the information
> here is that this data structure will hold something to do with
> threads. Perhaps say, "a map from TID to a Thread." A better variable
> name may then be tid_to_thread_map, but as map is implied you could
> do: tid_to_thread: Dict[int, Thread].

sure, I will rename this variable to tid_to_thread. However in my case
this needs to be kept a global variable, and I am not sure if I can
specify the data type for this variable since it creates a dependency
loop with the Thread class. I can leave it with just a comment
mentioning the type of key and value or I can write the type as
"Dict[int, Any]" which is not very useful.

> > +thread_map = {}
> > start_time = None
> >
> > def trace_end():
> > @@ -28,6 +29,57 @@ def trace_end():
> >
> > def process_event(param_dict):
> > global start_time
> > + global thread_map
> > +
> > + def _createThread(name, pid, tid):
> > + markers = {
> > + 'schema': {
> > + 'name': 0,
> > + 'startTime': 1,
> > + 'endTime': 2,
> > + 'phase': 3,
> > + 'category': 4,
> > + 'data': 5,
> > + },
> > + 'data': [],
> > + }
> > + samples = {
> > + 'schema': {
> > + 'stack': 0,
> > + 'time': 1,
> > + 'responsiveness': 2,
> > + },
> > + 'data': [],
> > + }
> > + frameTable = {
> > + 'schema': {
> > + 'location': 0,
> > + 'relevantForJS': 1,
> > + 'innerWindowID': 2,
> > + 'implementation': 3,
> > + 'optimizations': 4,
> > + 'line': 5,
> > + 'column': 6,
> > + 'category': 7,
> > + 'subcategory': 8,
> > + },
> > + 'data': [],
> > + }
> > + stackTable = {
> > + 'schema': {
> > + 'prefix': 0,
> > + 'frame': 1,
> > + },
> > + 'data': [],
> > + }
> > + stringTable = []
>
> Is there a missing return here?

No, I am not returning anything here.

> > +
> > + def _addThreadSample(pid, tid, threadName, time_stamp, stack):
> > + thread = thread_map.get(tid)
> > + if not thread:
> > + thread = _createThread(threadName, pid, tid)
> > + thread_map[tid] = thread
> > +
> > # Extract relevant information from the event parameters. The event parameters
> > # are in a dictionary:
> > time_stamp = (param_dict['sample']['time'] // 1000) / 1000
> > @@ -37,3 +89,21 @@ def process_event(param_dict):
> >
> > # Assume that start time is the time of the first event.
> > start_time = time_stamp if not start_time else start_time
> > +
> > + # Parse the callchain of the current sample into a stack array.
> > + if param_dict['callchain']:
> > + stack = []
> > + for call in param_dict['callchain']:
> > + if 'sym' not in call:
> > + continue
> > + stack.append(call['sym']['name'] + f' (in {call["dso"]})')
>
> Rather than mix an append and an f-string, just have the f-string ie:
> stack.append(f'{call["sym"]["name"]} (in {"call["dso"]})')

Thanks for this suggestion. I will make this change.

> > + if len(stack) != 0:
> > + stack = stack[::-1]
> > + _addThreadSample(pid, tid, thread_name, time_stamp, stack)
> > +
> > + # During perf record if -g is not used, the callchain is not available.
> > + # In that case, the symbol and dso are available in the event parameters.
> > + else:
> > + func = param_dict['symbol'] if 'symbol' in param_dict else '[unknown]'
> > + dso = param_dict['dso'] if 'dso' in param_dict else '[unknown]'
> > + _addThreadSample(pid, tid, thread_name, time_stamp, [func + f' (in {dso})'])
>
> Similarly:
> f'{func} (in {dso})'

Noted.

> Thanks,
> Ian
>
> > --
> > 2.34.1
> >