2024-01-22 07:11:16

by Ze Gao

[permalink] [raw]
Subject: [PATCH v2 0/4] perf sched: Fix task state report

No functional changes introduced in v2, just got v1 rebased onto
the tip of perf-tools-next for a clean apply.

---

Hi,

The problems of task state report in both libtraceevent
and perf sched has been reported in [1]. In short, they
parsed the wrong state due to relying on the outdated
hardcoded state string to interpret the raw bitmask
from the record, which left the messes to maintain the
backward compatibilities for both tools.

[1] has not managed to make itself into the kernel, the
problems and the solutions are well studied though.

Luckily, as suggested by Steven, perf/libtraceevent
records the print format, especially the __print_flags()
part of the in-kernel tracepoint sched_switch in its
metadata, and we have a chance to build the state str
on the fly by parsing it.

Now that libtraceevent has landed this solution in [2],
we now apply the same idea to perf as well.

Regards,

-- Ze

[1]: https://lore.kernel.org/lkml/[email protected]/
[2]: https://lore.kernel.org/linux-trace-devel/[email protected]/

Ze Gao (4):
perf sched: Sync state char array with the kernel
perf util: Add helpers to parse task state string from libtraceevent
perf util: Add evsel__taskstate() to parse the task state info instead
perf sched: Commit to evsel__taskstate() to parse task state info

tools/perf/builtin-sched.c | 57 +++------------
tools/perf/util/evsel.c | 146 +++++++++++++++++++++++++++++++++++++
tools/perf/util/evsel.h | 1 +
3 files changed, 157 insertions(+), 47 deletions(-)

--
2.41.0



2024-01-22 07:11:34

by Ze Gao

[permalink] [raw]
Subject: [PATCH v2 2/4] perf util: Add helpers to parse task state string from libtraceevent

Perf uses a hard coded string "RSDTtXZPI" to index the sched_switch
prev_state field raw bitmask value. This works well except for when
the kernel changes this string, in which case this will break again.

Instead we add a new way to parse task state string from tracepoint
print format already recorded by perf, which eliminates the further
dependencies with this hardcode and unmaintainable macro, and this
is exactly what libtraceevent[1] does for now.

So we borrow the print flags parsing logic from libtraceevent[1].
And in get_states(), we walk the print arguments until the
__print_flags() for the target state field is found, and use that to
build the states string for future parsing.

[1]: https://lore.kernel.org/linux-trace-devel/[email protected]/

Cc: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Ze Gao <[email protected]>
---
tools/perf/util/evsel.c | 112 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 112 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 6d7c9c58a9bc..e08294c51cd4 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2851,6 +2851,118 @@ u64 evsel__intval_common(struct evsel *evsel, struct perf_sample *sample, const
return field ? format_field__intval(field, sample, evsel->needs_swap) : 0;
}

+/*
+ * prev_state is of size long, which is 32 bits on 32 bit architectures.
+ * As it needs to have the same bits for both 32 bit and 64 bit architectures
+ * we can just assume that the flags we care about will all be within
+ * the 32 bits.
+ */
+#define MAX_STATE_BITS 32
+
+static const char *convert_sym(struct tep_print_flag_sym *sym)
+{
+ static char save_states[MAX_STATE_BITS + 1];
+
+ memset(save_states, 0, sizeof(save_states));
+
+ /* This is the flags for the prev_state_field, now make them into a string */
+ for (; sym; sym = sym->next) {
+ long bitmask = strtoul(sym->value, NULL, 0);
+ int i;
+
+ for (i = 0; !(bitmask & 1); i++)
+ bitmask >>= 1;
+
+ if (i >= MAX_STATE_BITS)
+ continue;
+
+ save_states[i] = sym->str[0];
+ }
+
+ return save_states;
+}
+
+static struct tep_print_arg_field *
+find_arg_field(struct tep_format_field *prev_state_field, struct tep_print_arg *arg)
+{
+ struct tep_print_arg_field *field;
+
+ if (!arg)
+ return NULL;
+
+ if (arg->type == TEP_PRINT_FIELD)
+ return &arg->field;
+
+ if (arg->type == TEP_PRINT_OP) {
+ field = find_arg_field(prev_state_field, arg->op.left);
+ if (field && field->field == prev_state_field)
+ return field;
+ field = find_arg_field(prev_state_field, arg->op.right);
+ if (field && field->field == prev_state_field)
+ return field;
+ }
+ return NULL;
+}
+
+static struct tep_print_flag_sym *
+test_flags(struct tep_format_field *prev_state_field, struct tep_print_arg *arg)
+{
+ struct tep_print_arg_field *field;
+
+ field = find_arg_field(prev_state_field, arg->flags.field);
+ if (!field)
+ return NULL;
+
+ return arg->flags.flags;
+}
+
+static struct tep_print_flag_sym *
+search_op(struct tep_format_field *prev_state_field, struct tep_print_arg *arg)
+{
+ struct tep_print_flag_sym *sym = NULL;
+
+ if (!arg)
+ return NULL;
+
+ if (arg->type == TEP_PRINT_OP) {
+ sym = search_op(prev_state_field, arg->op.left);
+ if (sym)
+ return sym;
+
+ sym = search_op(prev_state_field, arg->op.right);
+ if (sym)
+ return sym;
+ } else if (arg->type == TEP_PRINT_FLAGS) {
+ sym = test_flags(prev_state_field, arg);
+ }
+
+ return sym;
+}
+
+static __maybe_unused const char *get_states(struct tep_format_field *prev_state_field)
+{
+ struct tep_print_flag_sym *sym;
+ struct tep_print_arg *arg;
+ struct tep_event *event;
+
+ event = prev_state_field->event;
+
+ /*
+ * Look at the event format fields, and search for where
+ * the prev_state is parsed via the format flags.
+ */
+ for (arg = event->print_fmt.args; arg; arg = arg->next) {
+ /*
+ * Currently, the __print_flags() for the prev_state
+ * is embedded in operations, so they too must be
+ * searched.
+ */
+ sym = search_op(prev_state_field, arg);
+ if (sym)
+ return convert_sym(sym);
+ }
+ return NULL;
+}
#endif

bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
--
2.41.0


2024-01-22 07:12:10

by Ze Gao

[permalink] [raw]
Subject: [PATCH v2 4/4] perf sched: Commit to evsel__taskstate() to parse task state info

Now that we have evsel__taskstate() which no longer relies on the
hardcoded task state string and has good backward compatibility,
we have a good reason to use it.

Note TASK_STATE_TO_CHAR_STR and task bitmasks are useless now so
we remove them for good. And now we pass the state info back and
forth in a symbolic char which explains itself well instead.

Signed-off-by: Ze Gao <[email protected]>
---
tools/perf/builtin-sched.c | 46 +++++++++-----------------------------
1 file changed, 10 insertions(+), 36 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index ced6fffe8110..f1d62f6b6cab 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -92,13 +92,6 @@ struct sched_atom {
struct task_desc *wakee;
};

-#define TASK_STATE_TO_CHAR_STR "RSDTtXZPI"
-
-/* task state bitmask, copied from include/linux/sched.h */
-#define TASK_RUNNING 0
-#define TASK_INTERRUPTIBLE 1
-#define TASK_UNINTERRUPTIBLE 2
-
enum thread_state {
THREAD_SLEEPING = 0,
THREAD_WAIT_CPU,
@@ -255,7 +248,7 @@ struct thread_runtime {
u64 total_preempt_time;
u64 total_delay_time;

- int last_state;
+ char last_state;

char shortname[3];
bool comm_changed;
@@ -425,7 +418,7 @@ static void add_sched_event_wakeup(struct perf_sched *sched, struct task_desc *t
}

static void add_sched_event_sleep(struct perf_sched *sched, struct task_desc *task,
- u64 timestamp, u64 task_state __maybe_unused)
+ u64 timestamp, const char task_state __maybe_unused)
{
struct sched_atom *event = get_new_event(task, timestamp);

@@ -849,7 +842,7 @@ static int replay_switch_event(struct perf_sched *sched,
*next_comm = evsel__strval(evsel, sample, "next_comm");
const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
next_pid = evsel__intval(evsel, sample, "next_pid");
- const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
+ const char prev_state = evsel__taskstate(evsel, sample, "prev_state");
struct task_desc *prev, __maybe_unused *next;
u64 timestamp0, timestamp = sample->time;
int cpu = sample->cpu;
@@ -1039,13 +1032,6 @@ static int thread_atoms_insert(struct perf_sched *sched, struct thread *thread)
return 0;
}

-static char sched_out_state(u64 prev_state)
-{
- const char *str = TASK_STATE_TO_CHAR_STR;
-
- return str[prev_state];
-}
-
static int
add_sched_out_event(struct work_atoms *atoms,
char run_state,
@@ -1121,7 +1107,7 @@ static int latency_switch_event(struct perf_sched *sched,
{
const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
next_pid = evsel__intval(evsel, sample, "next_pid");
- const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
+ const char prev_state = evsel__taskstate(evsel, sample, "prev_state");
struct work_atoms *out_events, *in_events;
struct thread *sched_out, *sched_in;
u64 timestamp0, timestamp = sample->time;
@@ -1157,7 +1143,7 @@ static int latency_switch_event(struct perf_sched *sched,
goto out_put;
}
}
- if (add_sched_out_event(out_events, sched_out_state(prev_state), timestamp))
+ if (add_sched_out_event(out_events, prev_state, timestamp))
return -1;

in_events = thread_atoms_search(&sched->atom_root, sched_in, &sched->cmp_pid);
@@ -2022,24 +2008,12 @@ static void timehist_header(struct perf_sched *sched)
printf("\n");
}

-static char task_state_char(struct thread *thread, int state)
-{
- static const char state_to_char[] = TASK_STATE_TO_CHAR_STR;
- unsigned bit = state ? ffs(state) : 0;
-
- /* 'I' for idle */
- if (thread__tid(thread) == 0)
- return 'I';
-
- return bit < sizeof(state_to_char) - 1 ? state_to_char[bit] : '?';
-}
-
static void timehist_print_sample(struct perf_sched *sched,
struct evsel *evsel,
struct perf_sample *sample,
struct addr_location *al,
struct thread *thread,
- u64 t, int state)
+ u64 t, const char state)
{
struct thread_runtime *tr = thread__priv(thread);
const char *next_comm = evsel__strval(evsel, sample, "next_comm");
@@ -2080,7 +2054,7 @@ static void timehist_print_sample(struct perf_sched *sched,
print_sched_time(tr->dt_run, 6);

if (sched->show_state)
- printf(" %5c ", task_state_char(thread, state));
+ printf(" %5c ", thread->tid == 0 ? 'I' : state);

if (sched->show_next) {
snprintf(nstr, sizeof(nstr), "next: %s[%d]", next_comm, next_pid);
@@ -2152,9 +2126,9 @@ static void timehist_update_runtime_stats(struct thread_runtime *r,
else if (r->last_time) {
u64 dt_wait = tprev - r->last_time;

- if (r->last_state == TASK_RUNNING)
+ if (r->last_state == 'R')
r->dt_preempt = dt_wait;
- else if (r->last_state == TASK_UNINTERRUPTIBLE)
+ else if (r->last_state == 'D')
r->dt_iowait = dt_wait;
else
r->dt_sleep = dt_wait;
@@ -2579,7 +2553,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
struct thread_runtime *tr = NULL;
u64 tprev, t = sample->time;
int rc = 0;
- int state = evsel__intval(evsel, sample, "prev_state");
+ const char state = evsel__taskstate(evsel, sample, "prev_state");

addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0) {
--
2.41.0


2024-01-22 07:20:43

by Ze Gao

[permalink] [raw]
Subject: [PATCH v2 3/4] perf util: Add evsel__taskstate() to parse the task state info instead

Now that we have the __prinf_flags() parsing routines, we add a new
helper evsel__taskstate() to extract the task state info from the
recorded data.

Signed-off-by: Ze Gao <[email protected]>
---
tools/perf/util/evsel.c | 36 +++++++++++++++++++++++++++++++++++-
tools/perf/util/evsel.h | 1 +
2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index e08294c51cd4..4d14f14f2506 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2939,7 +2939,7 @@ search_op(struct tep_format_field *prev_state_field, struct tep_print_arg *arg)
return sym;
}

-static __maybe_unused const char *get_states(struct tep_format_field *prev_state_field)
+static const char *get_states(struct tep_format_field *prev_state_field)
{
struct tep_print_flag_sym *sym;
struct tep_print_arg *arg;
@@ -2963,6 +2963,40 @@ static __maybe_unused const char *get_states(struct tep_format_field *prev_state
}
return NULL;
}
+
+char evsel__taskstate(struct evsel *evsel, struct perf_sample *sample, const char *name)
+{
+ static struct tep_format_field *prev_state_field;
+ static const char *states;
+ struct tep_format_field *field;
+ unsigned long long val;
+ unsigned int bit;
+ char state = '?'; /* '?' denotes unknown task state */
+
+ field = evsel__field(evsel, name);
+
+ if (!field)
+ return state;
+
+ if (!states || field != prev_state_field) {
+ states = get_states(field);
+ if (!states)
+ return state;
+ prev_state_field = field;
+ }
+
+ /*
+ * Note since the kernel exposes TASK_REPORT_MAX to userspace
+ * to denote the 'preempted' state, we might as welll report
+ * 'R' for this case, which make senses to users as well.
+ *
+ * We can change this if we have a good reason in the future.
+ */
+ val = evsel__intval(evsel, sample, name);
+ bit = val ? ffs(val) : 0;
+ state = (!bit || bit > strlen(states)) ? 'R' : states[bit-1];
+ return state;
+}
#endif

bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index efbb6e848287..517cff431de2 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -339,6 +339,7 @@ struct perf_sample;
void *evsel__rawptr(struct evsel *evsel, struct perf_sample *sample, const char *name);
u64 evsel__intval(struct evsel *evsel, struct perf_sample *sample, const char *name);
u64 evsel__intval_common(struct evsel *evsel, struct perf_sample *sample, const char *name);
+char evsel__taskstate(struct evsel *evsel, struct perf_sample *sample, const char *name);

static inline char *evsel__strval(struct evsel *evsel, struct perf_sample *sample, const char *name)
{
--
2.41.0


2024-01-23 01:57:29

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] perf sched: Commit to evsel__taskstate() to parse task state info

Hello,

On Sun, Jan 21, 2024 at 11:11 PM Ze Gao <[email protected]> wrote:
>
> Now that we have evsel__taskstate() which no longer relies on the
> hardcoded task state string and has good backward compatibility,
> we have a good reason to use it.
>
> Note TASK_STATE_TO_CHAR_STR and task bitmasks are useless now so
> we remove them for good. And now we pass the state info back and
> forth in a symbolic char which explains itself well instead.
>
> Signed-off-by: Ze Gao <[email protected]>
> ---
> tools/perf/builtin-sched.c | 46 +++++++++-----------------------------
> 1 file changed, 10 insertions(+), 36 deletions(-)
>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index ced6fffe8110..f1d62f6b6cab 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -92,13 +92,6 @@ struct sched_atom {
> struct task_desc *wakee;
> };
>
> -#define TASK_STATE_TO_CHAR_STR "RSDTtXZPI"
> -
> -/* task state bitmask, copied from include/linux/sched.h */
> -#define TASK_RUNNING 0
> -#define TASK_INTERRUPTIBLE 1
> -#define TASK_UNINTERRUPTIBLE 2
> -
> enum thread_state {
> THREAD_SLEEPING = 0,
> THREAD_WAIT_CPU,
> @@ -255,7 +248,7 @@ struct thread_runtime {
> u64 total_preempt_time;
> u64 total_delay_time;
>
> - int last_state;
> + char last_state;
>
> char shortname[3];
> bool comm_changed;
> @@ -425,7 +418,7 @@ static void add_sched_event_wakeup(struct perf_sched *sched, struct task_desc *t
> }
>
> static void add_sched_event_sleep(struct perf_sched *sched, struct task_desc *task,
> - u64 timestamp, u64 task_state __maybe_unused)
> + u64 timestamp, const char task_state __maybe_unused)
> {
> struct sched_atom *event = get_new_event(task, timestamp);
>
> @@ -849,7 +842,7 @@ static int replay_switch_event(struct perf_sched *sched,
> *next_comm = evsel__strval(evsel, sample, "next_comm");
> const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
> next_pid = evsel__intval(evsel, sample, "next_pid");
> - const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
> + const char prev_state = evsel__taskstate(evsel, sample, "prev_state");
> struct task_desc *prev, __maybe_unused *next;
> u64 timestamp0, timestamp = sample->time;
> int cpu = sample->cpu;
> @@ -1039,13 +1032,6 @@ static int thread_atoms_insert(struct perf_sched *sched, struct thread *thread)
> return 0;
> }
>
> -static char sched_out_state(u64 prev_state)
> -{
> - const char *str = TASK_STATE_TO_CHAR_STR;
> -
> - return str[prev_state];
> -}
> -
> static int
> add_sched_out_event(struct work_atoms *atoms,
> char run_state,
> @@ -1121,7 +1107,7 @@ static int latency_switch_event(struct perf_sched *sched,
> {
> const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
> next_pid = evsel__intval(evsel, sample, "next_pid");
> - const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
> + const char prev_state = evsel__taskstate(evsel, sample, "prev_state");
> struct work_atoms *out_events, *in_events;
> struct thread *sched_out, *sched_in;
> u64 timestamp0, timestamp = sample->time;
> @@ -1157,7 +1143,7 @@ static int latency_switch_event(struct perf_sched *sched,
> goto out_put;
> }
> }
> - if (add_sched_out_event(out_events, sched_out_state(prev_state), timestamp))
> + if (add_sched_out_event(out_events, prev_state, timestamp))
> return -1;
>
> in_events = thread_atoms_search(&sched->atom_root, sched_in, &sched->cmp_pid);
> @@ -2022,24 +2008,12 @@ static void timehist_header(struct perf_sched *sched)
> printf("\n");
> }
>
> -static char task_state_char(struct thread *thread, int state)
> -{
> - static const char state_to_char[] = TASK_STATE_TO_CHAR_STR;
> - unsigned bit = state ? ffs(state) : 0;
> -
> - /* 'I' for idle */
> - if (thread__tid(thread) == 0)
> - return 'I';
> -
> - return bit < sizeof(state_to_char) - 1 ? state_to_char[bit] : '?';
> -}
> -
> static void timehist_print_sample(struct perf_sched *sched,
> struct evsel *evsel,
> struct perf_sample *sample,
> struct addr_location *al,
> struct thread *thread,
> - u64 t, int state)
> + u64 t, const char state)
> {
> struct thread_runtime *tr = thread__priv(thread);
> const char *next_comm = evsel__strval(evsel, sample, "next_comm");
> @@ -2080,7 +2054,7 @@ static void timehist_print_sample(struct perf_sched *sched,
> print_sched_time(tr->dt_run, 6);
>
> if (sched->show_state)
> - printf(" %5c ", task_state_char(thread, state));
> + printf(" %5c ", thread->tid == 0 ? 'I' : state);

This resulted in a build error with reference count checker.

$ make EXTRA_CFLAGS=-DREFCNT_CHECKING=1
...
builtin-sched.c: In function ‘timehist_print_sample’:
builtin-sched.c:2057:39: error: ‘struct thread’ has no member named ‘tid’
2057 | printf(" %5c ", thread->tid == 0 ? 'I' : state);
|

The struct thread is protected by the refcount checker so
you should not access the members directly. Instead,
please use a help function like thread__tid().

Thanks,
Namhyung

>
> if (sched->show_next) {
> snprintf(nstr, sizeof(nstr), "next: %s[%d]", next_comm, next_pid);
> @@ -2152,9 +2126,9 @@ static void timehist_update_runtime_stats(struct thread_runtime *r,
> else if (r->last_time) {
> u64 dt_wait = tprev - r->last_time;
>
> - if (r->last_state == TASK_RUNNING)
> + if (r->last_state == 'R')
> r->dt_preempt = dt_wait;
> - else if (r->last_state == TASK_UNINTERRUPTIBLE)
> + else if (r->last_state == 'D')
> r->dt_iowait = dt_wait;
> else
> r->dt_sleep = dt_wait;
> @@ -2579,7 +2553,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
> struct thread_runtime *tr = NULL;
> u64 tprev, t = sample->time;
> int rc = 0;
> - int state = evsel__intval(evsel, sample, "prev_state");
> + const char state = evsel__taskstate(evsel, sample, "prev_state");
>
> addr_location__init(&al);
> if (machine__resolve(machine, &al, sample) < 0) {
> --
> 2.41.0
>

2024-01-23 02:39:58

by Ze Gao

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] perf sched: Commit to evsel__taskstate() to parse task state info

On Tue, Jan 23, 2024 at 8:38 AM Namhyung Kim <[email protected]> wrote:
>
> Hello,
>
> On Sun, Jan 21, 2024 at 11:11 PM Ze Gao <[email protected]> wrote:
> >
> > Now that we have evsel__taskstate() which no longer relies on the
> > hardcoded task state string and has good backward compatibility,
> > we have a good reason to use it.
> >
> > Note TASK_STATE_TO_CHAR_STR and task bitmasks are useless now so
> > we remove them for good. And now we pass the state info back and
> > forth in a symbolic char which explains itself well instead.
> >
> > Signed-off-by: Ze Gao <[email protected]>
> > ---
> > tools/perf/builtin-sched.c | 46 +++++++++-----------------------------
> > 1 file changed, 10 insertions(+), 36 deletions(-)
> >
> > diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> > index ced6fffe8110..f1d62f6b6cab 100644
> > --- a/tools/perf/builtin-sched.c
> > +++ b/tools/perf/builtin-sched.c
> > @@ -92,13 +92,6 @@ struct sched_atom {
> > struct task_desc *wakee;
> > };
> >
> > -#define TASK_STATE_TO_CHAR_STR "RSDTtXZPI"
> > -
> > -/* task state bitmask, copied from include/linux/sched.h */
> > -#define TASK_RUNNING 0
> > -#define TASK_INTERRUPTIBLE 1
> > -#define TASK_UNINTERRUPTIBLE 2
> > -
> > enum thread_state {
> > THREAD_SLEEPING = 0,
> > THREAD_WAIT_CPU,
> > @@ -255,7 +248,7 @@ struct thread_runtime {
> > u64 total_preempt_time;
> > u64 total_delay_time;
> >
> > - int last_state;
> > + char last_state;
> >
> > char shortname[3];
> > bool comm_changed;
> > @@ -425,7 +418,7 @@ static void add_sched_event_wakeup(struct perf_sched *sched, struct task_desc *t
> > }
> >
> > static void add_sched_event_sleep(struct perf_sched *sched, struct task_desc *task,
> > - u64 timestamp, u64 task_state __maybe_unused)
> > + u64 timestamp, const char task_state __maybe_unused)
> > {
> > struct sched_atom *event = get_new_event(task, timestamp);
> >
> > @@ -849,7 +842,7 @@ static int replay_switch_event(struct perf_sched *sched,
> > *next_comm = evsel__strval(evsel, sample, "next_comm");
> > const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
> > next_pid = evsel__intval(evsel, sample, "next_pid");
> > - const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
> > + const char prev_state = evsel__taskstate(evsel, sample, "prev_state");
> > struct task_desc *prev, __maybe_unused *next;
> > u64 timestamp0, timestamp = sample->time;
> > int cpu = sample->cpu;
> > @@ -1039,13 +1032,6 @@ static int thread_atoms_insert(struct perf_sched *sched, struct thread *thread)
> > return 0;
> > }
> >
> > -static char sched_out_state(u64 prev_state)
> > -{
> > - const char *str = TASK_STATE_TO_CHAR_STR;
> > -
> > - return str[prev_state];
> > -}
> > -
> > static int
> > add_sched_out_event(struct work_atoms *atoms,
> > char run_state,
> > @@ -1121,7 +1107,7 @@ static int latency_switch_event(struct perf_sched *sched,
> > {
> > const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
> > next_pid = evsel__intval(evsel, sample, "next_pid");
> > - const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
> > + const char prev_state = evsel__taskstate(evsel, sample, "prev_state");
> > struct work_atoms *out_events, *in_events;
> > struct thread *sched_out, *sched_in;
> > u64 timestamp0, timestamp = sample->time;
> > @@ -1157,7 +1143,7 @@ static int latency_switch_event(struct perf_sched *sched,
> > goto out_put;
> > }
> > }
> > - if (add_sched_out_event(out_events, sched_out_state(prev_state), timestamp))
> > + if (add_sched_out_event(out_events, prev_state, timestamp))
> > return -1;
> >
> > in_events = thread_atoms_search(&sched->atom_root, sched_in, &sched->cmp_pid);
> > @@ -2022,24 +2008,12 @@ static void timehist_header(struct perf_sched *sched)
> > printf("\n");
> > }
> >
> > -static char task_state_char(struct thread *thread, int state)
> > -{
> > - static const char state_to_char[] = TASK_STATE_TO_CHAR_STR;
> > - unsigned bit = state ? ffs(state) : 0;
> > -
> > - /* 'I' for idle */
> > - if (thread__tid(thread) == 0)
> > - return 'I';
> > -
> > - return bit < sizeof(state_to_char) - 1 ? state_to_char[bit] : '?';
> > -}
> > -
> > static void timehist_print_sample(struct perf_sched *sched,
> > struct evsel *evsel,
> > struct perf_sample *sample,
> > struct addr_location *al,
> > struct thread *thread,
> > - u64 t, int state)
> > + u64 t, const char state)
> > {
> > struct thread_runtime *tr = thread__priv(thread);
> > const char *next_comm = evsel__strval(evsel, sample, "next_comm");
> > @@ -2080,7 +2054,7 @@ static void timehist_print_sample(struct perf_sched *sched,
> > print_sched_time(tr->dt_run, 6);
> >
> > if (sched->show_state)
> > - printf(" %5c ", task_state_char(thread, state));
> > + printf(" %5c ", thread->tid == 0 ? 'I' : state);
>
> This resulted in a build error with reference count checker.
>
> $ make EXTRA_CFLAGS=-DREFCNT_CHECKING=1
> ...
> builtin-sched.c: In function ‘timehist_print_sample’:
> builtin-sched.c:2057:39: error: ‘struct thread’ has no member named ‘tid’
> 2057 | printf(" %5c ", thread->tid == 0 ? 'I' : state);
> |
>
> The struct thread is protected by the refcount checker so
> you should not access the members directly. Instead,
> please use a help function like thread__tid().

Thanks for pointing this out. Commit ee84a3032b74("
perf thread: Add accessor functions for thread") introduced
this accessor which i overlooked. My bad :(.

Will send a fix in-reply-to this patch separately.

Thanks,
-- Ze

> Thanks,
> Namhyung
>
> >
> > if (sched->show_next) {
> > snprintf(nstr, sizeof(nstr), "next: %s[%d]", next_comm, next_pid);
> > @@ -2152,9 +2126,9 @@ static void timehist_update_runtime_stats(struct thread_runtime *r,
> > else if (r->last_time) {
> > u64 dt_wait = tprev - r->last_time;
> >
> > - if (r->last_state == TASK_RUNNING)
> > + if (r->last_state == 'R')
> > r->dt_preempt = dt_wait;
> > - else if (r->last_state == TASK_UNINTERRUPTIBLE)
> > + else if (r->last_state == 'D')
> > r->dt_iowait = dt_wait;
> > else
> > r->dt_sleep = dt_wait;
> > @@ -2579,7 +2553,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
> > struct thread_runtime *tr = NULL;
> > u64 tprev, t = sample->time;
> > int rc = 0;
> > - int state = evsel__intval(evsel, sample, "prev_state");
> > + const char state = evsel__taskstate(evsel, sample, "prev_state");
> >
> > addr_location__init(&al);
> > if (machine__resolve(machine, &al, sample) < 0) {
> > --
> > 2.41.0
> >

2024-01-23 03:03:51

by Ze Gao

[permalink] [raw]
Subject: [PATCH v2 4/4] perf sched: Commit to evsel__taskstate() to parse task state info

Now that we have evsel__taskstate() which no longer relies on the
hardcoded task state string and has good backward compatibility,
we have a good reason to use it.

Note TASK_STATE_TO_CHAR_STR and task bitmasks are useless now so
we remove them for good. And now we pass the state info back and
forth in a symbolic char which explains itself well instead.

Signed-off-by: Ze Gao <[email protected]>
---
tools/perf/builtin-sched.c | 46 +++++++++-----------------------------
1 file changed, 10 insertions(+), 36 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index ced6fffe8110..42d5fc5d6b7b 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -92,13 +92,6 @@ struct sched_atom {
struct task_desc *wakee;
};

-#define TASK_STATE_TO_CHAR_STR "RSDTtXZPI"
-
-/* task state bitmask, copied from include/linux/sched.h */
-#define TASK_RUNNING 0
-#define TASK_INTERRUPTIBLE 1
-#define TASK_UNINTERRUPTIBLE 2
-
enum thread_state {
THREAD_SLEEPING = 0,
THREAD_WAIT_CPU,
@@ -255,7 +248,7 @@ struct thread_runtime {
u64 total_preempt_time;
u64 total_delay_time;

- int last_state;
+ char last_state;

char shortname[3];
bool comm_changed;
@@ -425,7 +418,7 @@ static void add_sched_event_wakeup(struct perf_sched *sched, struct task_desc *t
}

static void add_sched_event_sleep(struct perf_sched *sched, struct task_desc *task,
- u64 timestamp, u64 task_state __maybe_unused)
+ u64 timestamp, const char task_state __maybe_unused)
{
struct sched_atom *event = get_new_event(task, timestamp);

@@ -849,7 +842,7 @@ static int replay_switch_event(struct perf_sched *sched,
*next_comm = evsel__strval(evsel, sample, "next_comm");
const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
next_pid = evsel__intval(evsel, sample, "next_pid");
- const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
+ const char prev_state = evsel__taskstate(evsel, sample, "prev_state");
struct task_desc *prev, __maybe_unused *next;
u64 timestamp0, timestamp = sample->time;
int cpu = sample->cpu;
@@ -1039,13 +1032,6 @@ static int thread_atoms_insert(struct perf_sched *sched, struct thread *thread)
return 0;
}

-static char sched_out_state(u64 prev_state)
-{
- const char *str = TASK_STATE_TO_CHAR_STR;
-
- return str[prev_state];
-}
-
static int
add_sched_out_event(struct work_atoms *atoms,
char run_state,
@@ -1121,7 +1107,7 @@ static int latency_switch_event(struct perf_sched *sched,
{
const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
next_pid = evsel__intval(evsel, sample, "next_pid");
- const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
+ const char prev_state = evsel__taskstate(evsel, sample, "prev_state");
struct work_atoms *out_events, *in_events;
struct thread *sched_out, *sched_in;
u64 timestamp0, timestamp = sample->time;
@@ -1157,7 +1143,7 @@ static int latency_switch_event(struct perf_sched *sched,
goto out_put;
}
}
- if (add_sched_out_event(out_events, sched_out_state(prev_state), timestamp))
+ if (add_sched_out_event(out_events, prev_state, timestamp))
return -1;

in_events = thread_atoms_search(&sched->atom_root, sched_in, &sched->cmp_pid);
@@ -2022,24 +2008,12 @@ static void timehist_header(struct perf_sched *sched)
printf("\n");
}

-static char task_state_char(struct thread *thread, int state)
-{
- static const char state_to_char[] = TASK_STATE_TO_CHAR_STR;
- unsigned bit = state ? ffs(state) : 0;
-
- /* 'I' for idle */
- if (thread__tid(thread) == 0)
- return 'I';
-
- return bit < sizeof(state_to_char) - 1 ? state_to_char[bit] : '?';
-}
-
static void timehist_print_sample(struct perf_sched *sched,
struct evsel *evsel,
struct perf_sample *sample,
struct addr_location *al,
struct thread *thread,
- u64 t, int state)
+ u64 t, const char state)
{
struct thread_runtime *tr = thread__priv(thread);
const char *next_comm = evsel__strval(evsel, sample, "next_comm");
@@ -2080,7 +2054,7 @@ static void timehist_print_sample(struct perf_sched *sched,
print_sched_time(tr->dt_run, 6);

if (sched->show_state)
- printf(" %5c ", task_state_char(thread, state));
+ printf(" %5c ", thread__tid(thread) == 0 ? 'I' : state);

if (sched->show_next) {
snprintf(nstr, sizeof(nstr), "next: %s[%d]", next_comm, next_pid);
@@ -2152,9 +2126,9 @@ static void timehist_update_runtime_stats(struct thread_runtime *r,
else if (r->last_time) {
u64 dt_wait = tprev - r->last_time;

- if (r->last_state == TASK_RUNNING)
+ if (r->last_state == 'R')
r->dt_preempt = dt_wait;
- else if (r->last_state == TASK_UNINTERRUPTIBLE)
+ else if (r->last_state == 'D')
r->dt_iowait = dt_wait;
else
r->dt_sleep = dt_wait;
@@ -2579,7 +2553,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
struct thread_runtime *tr = NULL;
u64 tprev, t = sample->time;
int rc = 0;
- int state = evsel__intval(evsel, sample, "prev_state");
+ const char state = evsel__taskstate(evsel, sample, "prev_state");

addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0) {
--
2.41.0


2024-01-23 07:03:51

by Ze Gao

[permalink] [raw]
Subject: [PATCH] perf evsel: Rename get_states() to parse_task_states() and make it public

Since get_states() assumes the existence of libtraceevent, so move
to where it should belong, i.e, util/trace-event-parse.c, and also
rename it to parse_task_states().

Leave evsel_getstate() untouched as it fits well in the evsel
category.

Also make some necessary tweaks for python support, and get it
verified with: perf test python.

Signed-off-by: Ze Gao <[email protected]>
---

Hi Namhyung,

This is the promised refactoring patch for [1], and hopefully it
works as what we all expected.

[1]: https://lore.kernel.org/all/[email protected]/


Regards,
-- Ze

tools/perf/Makefile.perf | 2 +-
tools/perf/util/evsel.c | 115 +---------------------------
tools/perf/util/python-ext-sources | 1 +
tools/perf/util/setup.py | 1 +
tools/perf/util/trace-event-parse.c | 113 +++++++++++++++++++++++++++
tools/perf/util/trace-event.h | 3 +
6 files changed, 120 insertions(+), 115 deletions(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 27e7c478880f..a5d274bd804b 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -370,7 +370,7 @@ python-clean := $(call QUIET_CLEAN, python) $(RM) -r $(PYTHON_EXTBUILD) $(OUTPUT
ifeq ($(CONFIG_LIBTRACEEVENT),y)
PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources)
else
- PYTHON_EXT_SRCS := $(shell grep -v ^\#\\\|util/trace-event.c util/python-ext-sources)
+ PYTHON_EXT_SRCS := $(shell grep -v ^\#\\\|util/trace-event.c\\\|util/trace-event-parse.c util/python-ext-sources)
endif

PYTHON_EXT_DEPS := util/python-ext-sources util/setup.py $(LIBAPI)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4d14f14f2506..9e67324b1608 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2851,119 +2851,6 @@ u64 evsel__intval_common(struct evsel *evsel, struct perf_sample *sample, const
return field ? format_field__intval(field, sample, evsel->needs_swap) : 0;
}

-/*
- * prev_state is of size long, which is 32 bits on 32 bit architectures.
- * As it needs to have the same bits for both 32 bit and 64 bit architectures
- * we can just assume that the flags we care about will all be within
- * the 32 bits.
- */
-#define MAX_STATE_BITS 32
-
-static const char *convert_sym(struct tep_print_flag_sym *sym)
-{
- static char save_states[MAX_STATE_BITS + 1];
-
- memset(save_states, 0, sizeof(save_states));
-
- /* This is the flags for the prev_state_field, now make them into a string */
- for (; sym; sym = sym->next) {
- long bitmask = strtoul(sym->value, NULL, 0);
- int i;
-
- for (i = 0; !(bitmask & 1); i++)
- bitmask >>= 1;
-
- if (i >= MAX_STATE_BITS)
- continue;
-
- save_states[i] = sym->str[0];
- }
-
- return save_states;
-}
-
-static struct tep_print_arg_field *
-find_arg_field(struct tep_format_field *prev_state_field, struct tep_print_arg *arg)
-{
- struct tep_print_arg_field *field;
-
- if (!arg)
- return NULL;
-
- if (arg->type == TEP_PRINT_FIELD)
- return &arg->field;
-
- if (arg->type == TEP_PRINT_OP) {
- field = find_arg_field(prev_state_field, arg->op.left);
- if (field && field->field == prev_state_field)
- return field;
- field = find_arg_field(prev_state_field, arg->op.right);
- if (field && field->field == prev_state_field)
- return field;
- }
- return NULL;
-}
-
-static struct tep_print_flag_sym *
-test_flags(struct tep_format_field *prev_state_field, struct tep_print_arg *arg)
-{
- struct tep_print_arg_field *field;
-
- field = find_arg_field(prev_state_field, arg->flags.field);
- if (!field)
- return NULL;
-
- return arg->flags.flags;
-}
-
-static struct tep_print_flag_sym *
-search_op(struct tep_format_field *prev_state_field, struct tep_print_arg *arg)
-{
- struct tep_print_flag_sym *sym = NULL;
-
- if (!arg)
- return NULL;
-
- if (arg->type == TEP_PRINT_OP) {
- sym = search_op(prev_state_field, arg->op.left);
- if (sym)
- return sym;
-
- sym = search_op(prev_state_field, arg->op.right);
- if (sym)
- return sym;
- } else if (arg->type == TEP_PRINT_FLAGS) {
- sym = test_flags(prev_state_field, arg);
- }
-
- return sym;
-}
-
-static const char *get_states(struct tep_format_field *prev_state_field)
-{
- struct tep_print_flag_sym *sym;
- struct tep_print_arg *arg;
- struct tep_event *event;
-
- event = prev_state_field->event;
-
- /*
- * Look at the event format fields, and search for where
- * the prev_state is parsed via the format flags.
- */
- for (arg = event->print_fmt.args; arg; arg = arg->next) {
- /*
- * Currently, the __print_flags() for the prev_state
- * is embedded in operations, so they too must be
- * searched.
- */
- sym = search_op(prev_state_field, arg);
- if (sym)
- return convert_sym(sym);
- }
- return NULL;
-}
-
char evsel__taskstate(struct evsel *evsel, struct perf_sample *sample, const char *name)
{
static struct tep_format_field *prev_state_field;
@@ -2979,7 +2866,7 @@ char evsel__taskstate(struct evsel *evsel, struct perf_sample *sample, const cha
return state;

if (!states || field != prev_state_field) {
- states = get_states(field);
+ states = parse_task_states(field);
if (!states)
return state;
prev_state_field = field;
diff --git a/tools/perf/util/python-ext-sources b/tools/perf/util/python-ext-sources
index 593b660ec75e..1bec945f4838 100644
--- a/tools/perf/util/python-ext-sources
+++ b/tools/perf/util/python-ext-sources
@@ -31,6 +31,7 @@ util/counts.c
util/print_binary.c
util/strlist.c
util/trace-event.c
+util/trace-event-parse.c
../lib/rbtree.c
util/string.c
util/symbol_fprintf.c
diff --git a/tools/perf/util/setup.py b/tools/perf/util/setup.py
index 79d5e2955f85..3107f5aa8c9a 100644
--- a/tools/perf/util/setup.py
+++ b/tools/perf/util/setup.py
@@ -85,6 +85,7 @@ if '-DHAVE_LIBTRACEEVENT' in cflags:
extra_libraries += [ 'traceevent' ]
else:
ext_sources.remove('util/trace-event.c')
+ ext_sources.remove('util/trace-event-parse.c')

# use full paths with source files
ext_sources = list(map(lambda x: '%s/%s' % (src_perf, x) , ext_sources))
diff --git a/tools/perf/util/trace-event-parse.c b/tools/perf/util/trace-event-parse.c
index 2d3c2576bab7..f0332bd3a501 100644
--- a/tools/perf/util/trace-event-parse.c
+++ b/tools/perf/util/trace-event-parse.c
@@ -122,6 +122,119 @@ void event_format__print(struct tep_event *event,
return event_format__fprintf(event, cpu, data, size, stdout);
}

+/*
+ * prev_state is of size long, which is 32 bits on 32 bit architectures.
+ * As it needs to have the same bits for both 32 bit and 64 bit architectures
+ * we can just assume that the flags we care about will all be within
+ * the 32 bits.
+ */
+#define MAX_STATE_BITS 32
+
+static const char *convert_sym(struct tep_print_flag_sym *sym)
+{
+ static char save_states[MAX_STATE_BITS + 1];
+
+ memset(save_states, 0, sizeof(save_states));
+
+ /* This is the flags for the prev_state_field, now make them into a string */
+ for (; sym; sym = sym->next) {
+ long bitmask = strtoul(sym->value, NULL, 0);
+ int i;
+
+ for (i = 0; !(bitmask & 1); i++)
+ bitmask >>= 1;
+
+ if (i >= MAX_STATE_BITS)
+ continue;
+
+ save_states[i] = sym->str[0];
+ }
+
+ return save_states;
+}
+
+static struct tep_print_arg_field *
+find_arg_field(struct tep_format_field *prev_state_field, struct tep_print_arg *arg)
+{
+ struct tep_print_arg_field *field;
+
+ if (!arg)
+ return NULL;
+
+ if (arg->type == TEP_PRINT_FIELD)
+ return &arg->field;
+
+ if (arg->type == TEP_PRINT_OP) {
+ field = find_arg_field(prev_state_field, arg->op.left);
+ if (field && field->field == prev_state_field)
+ return field;
+ field = find_arg_field(prev_state_field, arg->op.right);
+ if (field && field->field == prev_state_field)
+ return field;
+ }
+ return NULL;
+}
+
+static struct tep_print_flag_sym *
+test_flags(struct tep_format_field *prev_state_field, struct tep_print_arg *arg)
+{
+ struct tep_print_arg_field *field;
+
+ field = find_arg_field(prev_state_field, arg->flags.field);
+ if (!field)
+ return NULL;
+
+ return arg->flags.flags;
+}
+
+static struct tep_print_flag_sym *
+search_op(struct tep_format_field *prev_state_field, struct tep_print_arg *arg)
+{
+ struct tep_print_flag_sym *sym = NULL;
+
+ if (!arg)
+ return NULL;
+
+ if (arg->type == TEP_PRINT_OP) {
+ sym = search_op(prev_state_field, arg->op.left);
+ if (sym)
+ return sym;
+
+ sym = search_op(prev_state_field, arg->op.right);
+ if (sym)
+ return sym;
+ } else if (arg->type == TEP_PRINT_FLAGS) {
+ sym = test_flags(prev_state_field, arg);
+ }
+
+ return sym;
+}
+
+const char *parse_task_states(struct tep_format_field *state_field)
+{
+ struct tep_print_flag_sym *sym;
+ struct tep_print_arg *arg;
+ struct tep_event *event;
+
+ event = state_field->event;
+
+ /*
+ * Look at the event format fields, and search for where
+ * the prev_state is parsed via the format flags.
+ */
+ for (arg = event->print_fmt.args; arg; arg = arg->next) {
+ /*
+ * Currently, the __print_flags() for the prev_state
+ * is embedded in operations, so they too must be
+ * searched.
+ */
+ sym = search_op(state_field, arg);
+ if (sym)
+ return convert_sym(sym);
+ }
+ return NULL;
+}
+
void parse_ftrace_printk(struct tep_handle *pevent,
char *file, unsigned int size __maybe_unused)
{
diff --git a/tools/perf/util/trace-event.h b/tools/perf/util/trace-event.h
index a69ee29419f3..bbf8b26bc8da 100644
--- a/tools/perf/util/trace-event.h
+++ b/tools/perf/util/trace-event.h
@@ -15,6 +15,7 @@ struct perf_tool;
struct thread;
struct tep_plugin_list;
struct evsel;
+struct tep_format_field;

struct trace_event {
struct tep_handle *pevent;
@@ -51,6 +52,8 @@ int parse_event_file(struct tep_handle *pevent,
unsigned long long
raw_field_value(struct tep_event *event, const char *name, void *data);

+const char *parse_task_states(struct tep_format_field *state_field);
+
void parse_proc_kallsyms(struct tep_handle *pevent, char *file, unsigned int size);
void parse_ftrace_printk(struct tep_handle *pevent, char *file, unsigned int size);
void parse_saved_cmdline(struct tep_handle *pevent, char *file, unsigned int size);
--
2.41.0


2024-01-24 00:17:50

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] perf sched: Commit to evsel__taskstate() to parse task state info

On Mon, Jan 22, 2024 at 6:24 PM Ze Gao <[email protected]> wrote:
>
> Now that we have evsel__taskstate() which no longer relies on the
> hardcoded task state string and has good backward compatibility,
> we have a good reason to use it.
>
> Note TASK_STATE_TO_CHAR_STR and task bitmasks are useless now so
> we remove them for good. And now we pass the state info back and
> forth in a symbolic char which explains itself well instead.
>
> Signed-off-by: Ze Gao <[email protected]>

Thanks for the update!
Namhyung


> ---
> tools/perf/builtin-sched.c | 46 +++++++++-----------------------------
> 1 file changed, 10 insertions(+), 36 deletions(-)
>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index ced6fffe8110..42d5fc5d6b7b 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -92,13 +92,6 @@ struct sched_atom {
> struct task_desc *wakee;
> };
>
> -#define TASK_STATE_TO_CHAR_STR "RSDTtXZPI"
> -
> -/* task state bitmask, copied from include/linux/sched.h */
> -#define TASK_RUNNING 0
> -#define TASK_INTERRUPTIBLE 1
> -#define TASK_UNINTERRUPTIBLE 2
> -
> enum thread_state {
> THREAD_SLEEPING = 0,
> THREAD_WAIT_CPU,
> @@ -255,7 +248,7 @@ struct thread_runtime {
> u64 total_preempt_time;
> u64 total_delay_time;
>
> - int last_state;
> + char last_state;
>
> char shortname[3];
> bool comm_changed;
> @@ -425,7 +418,7 @@ static void add_sched_event_wakeup(struct perf_sched *sched, struct task_desc *t
> }
>
> static void add_sched_event_sleep(struct perf_sched *sched, struct task_desc *task,
> - u64 timestamp, u64 task_state __maybe_unused)
> + u64 timestamp, const char task_state __maybe_unused)
> {
> struct sched_atom *event = get_new_event(task, timestamp);
>
> @@ -849,7 +842,7 @@ static int replay_switch_event(struct perf_sched *sched,
> *next_comm = evsel__strval(evsel, sample, "next_comm");
> const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
> next_pid = evsel__intval(evsel, sample, "next_pid");
> - const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
> + const char prev_state = evsel__taskstate(evsel, sample, "prev_state");
> struct task_desc *prev, __maybe_unused *next;
> u64 timestamp0, timestamp = sample->time;
> int cpu = sample->cpu;
> @@ -1039,13 +1032,6 @@ static int thread_atoms_insert(struct perf_sched *sched, struct thread *thread)
> return 0;
> }
>
> -static char sched_out_state(u64 prev_state)
> -{
> - const char *str = TASK_STATE_TO_CHAR_STR;
> -
> - return str[prev_state];
> -}
> -
> static int
> add_sched_out_event(struct work_atoms *atoms,
> char run_state,
> @@ -1121,7 +1107,7 @@ static int latency_switch_event(struct perf_sched *sched,
> {
> const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
> next_pid = evsel__intval(evsel, sample, "next_pid");
> - const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
> + const char prev_state = evsel__taskstate(evsel, sample, "prev_state");
> struct work_atoms *out_events, *in_events;
> struct thread *sched_out, *sched_in;
> u64 timestamp0, timestamp = sample->time;
> @@ -1157,7 +1143,7 @@ static int latency_switch_event(struct perf_sched *sched,
> goto out_put;
> }
> }
> - if (add_sched_out_event(out_events, sched_out_state(prev_state), timestamp))
> + if (add_sched_out_event(out_events, prev_state, timestamp))
> return -1;
>
> in_events = thread_atoms_search(&sched->atom_root, sched_in, &sched->cmp_pid);
> @@ -2022,24 +2008,12 @@ static void timehist_header(struct perf_sched *sched)
> printf("\n");
> }
>
> -static char task_state_char(struct thread *thread, int state)
> -{
> - static const char state_to_char[] = TASK_STATE_TO_CHAR_STR;
> - unsigned bit = state ? ffs(state) : 0;
> -
> - /* 'I' for idle */
> - if (thread__tid(thread) == 0)
> - return 'I';
> -
> - return bit < sizeof(state_to_char) - 1 ? state_to_char[bit] : '?';
> -}
> -
> static void timehist_print_sample(struct perf_sched *sched,
> struct evsel *evsel,
> struct perf_sample *sample,
> struct addr_location *al,
> struct thread *thread,
> - u64 t, int state)
> + u64 t, const char state)
> {
> struct thread_runtime *tr = thread__priv(thread);
> const char *next_comm = evsel__strval(evsel, sample, "next_comm");
> @@ -2080,7 +2054,7 @@ static void timehist_print_sample(struct perf_sched *sched,
> print_sched_time(tr->dt_run, 6);
>
> if (sched->show_state)
> - printf(" %5c ", task_state_char(thread, state));
> + printf(" %5c ", thread__tid(thread) == 0 ? 'I' : state);
>
> if (sched->show_next) {
> snprintf(nstr, sizeof(nstr), "next: %s[%d]", next_comm, next_pid);
> @@ -2152,9 +2126,9 @@ static void timehist_update_runtime_stats(struct thread_runtime *r,
> else if (r->last_time) {
> u64 dt_wait = tprev - r->last_time;
>
> - if (r->last_state == TASK_RUNNING)
> + if (r->last_state == 'R')
> r->dt_preempt = dt_wait;
> - else if (r->last_state == TASK_UNINTERRUPTIBLE)
> + else if (r->last_state == 'D')
> r->dt_iowait = dt_wait;
> else
> r->dt_sleep = dt_wait;
> @@ -2579,7 +2553,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
> struct thread_runtime *tr = NULL;
> u64 tprev, t = sample->time;
> int rc = 0;
> - int state = evsel__intval(evsel, sample, "prev_state");
> + const char state = evsel__taskstate(evsel, sample, "prev_state");
>
> addr_location__init(&al);
> if (machine__resolve(machine, &al, sample) < 0) {
> --
> 2.41.0
>

2024-01-24 00:38:54

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] perf sched: Fix task state report

On Mon, 22 Jan 2024 02:08:55 -0500, Ze Gao wrote:
> No functional changes introduced in v2, just got v1 rebased onto
> the tip of perf-tools-next for a clean apply.
>

Applied to perf-tools-next except for the last refactoring one.
I'll test it separately.

Best regards,
--
Namhyung Kim <[email protected]>

2024-01-24 02:08:55

by Ze Gao

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] perf sched: Fix task state report

On Wed, Jan 24, 2024 at 8:38 AM Namhyung Kim <[email protected]> wrote:
>
> On Mon, 22 Jan 2024 02:08:55 -0500, Ze Gao wrote:
> > No functional changes introduced in v2, just got v1 rebased onto
> > the tip of perf-tools-next for a clean apply.
> >
>
> Applied to perf-tools-next except for the last refactoring one.
> I'll test it separately.

Awesome!

Thanks,
-- Ze

> Best regards,
> --
> Namhyung Kim <[email protected]>

2024-02-02 20:58:01

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH] perf evsel: Rename get_states() to parse_task_states() and make it public

On Tue, 23 Jan 2024 02:02:11 -0500, Ze Gao wrote:
> Since get_states() assumes the existence of libtraceevent, so move
> to where it should belong, i.e, util/trace-event-parse.c, and also
> rename it to parse_task_states().
>
> Leave evsel_getstate() untouched as it fits well in the evsel
> category.
>
> [...]

Applied to perf-tools-next now, thanks for your work!

Best regards,
--
Namhyung Kim <[email protected]>

2024-02-04 02:03:39

by Ze Gao

[permalink] [raw]
Subject: Re: [PATCH] perf evsel: Rename get_states() to parse_task_states() and make it public

On Sat, Feb 3, 2024 at 4:57 AM Namhyung Kim <[email protected]> wrote:
>
> On Tue, 23 Jan 2024 02:02:11 -0500, Ze Gao wrote:
> > Since get_states() assumes the existence of libtraceevent, so move
> > to where it should belong, i.e, util/trace-event-parse.c, and also
> > rename it to parse_task_states().
> >
> > Leave evsel_getstate() untouched as it fits well in the evsel
> > category.
> >
> > [...]
>
> Applied to perf-tools-next now, thanks for your work!

Thanks for your update! :)

Cheers,
-- Ze
> --
> Namhyung Kim <[email protected]>