2023-03-17 10:05:55

by Hangliang Lai

Subject: [PATCH] perf top: expand the range of multithreaded phase

In __cmd_top, perf_set_multithreaded is used to enable pthread_rwlock, so
that down_read and down_write can handle concurrency problems. Then top
calls perf_set_singlethreaded and switches to the single-threaded phase,
assuming that no thread concurrency will happen later.

However, a UAF problem could occur in perf top in the single-threaded phase.
The concurrent procedure is like this:

display_thread                  process_thread
--------------                  --------------

thread__comm_len
  -> thread__comm_str
    -> __thread__comm_str(thread)
                                thread__delete
                                  -> comm__free
                                    -> comm_str__put
                                      -> zfree(&cs->str)
  -> thread->comm_len = strlen(comm);

Since perf_singlethreaded is true in the single-threaded phase, down_read
and down_write do nothing there and cannot prevent concurrency problems.

This patch moves perf_set_singlethreaded to the function tail to expand the
multithreaded phase, so that display_thread and process_thread run safely.

Signed-off-by: Hangliang Lai <[email protected]>
Reported-by: Wenyu Liu <[email protected]>
Reviewed-by: Yunfeng Ye <[email protected]>
---
 tools/perf/builtin-top.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 7c6413447..74239940b 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1280,9 +1280,6 @@ static int __cmd_top(struct perf_top *top)
 				    top->evlist->core.threads, false,
 				    top->nr_threads_synthesize);
 
-	if (top->nr_threads_synthesize > 1)
-		perf_set_singlethreaded();
-
 	if (perf_hpp_list.socket) {
 		ret = perf_env__read_cpu_topology_map(&perf_env);
 		if (ret < 0) {
@@ -1359,6 +1356,10 @@ out_join:
 out_join_thread:
 	pthread_cond_signal(&top->qe.cond);
 	pthread_join(thread_process, NULL);
+
+	if (top->nr_threads_synthesize > 1)
+		perf_set_singlethreaded();
+
 	return ret;
 }

--
2.33.0



2023-04-01 00:21:44

by Namhyung Kim

Subject: Re: [PATCH] perf top: expand the range of multithreaded phase

Hello,

Sorry for the late reply.

On Fri, Mar 17, 2023 at 3:05 AM Hangliang Lai <[email protected]> wrote:
>
> In __cmd_top, perf_set_multithreaded is used to enable pthread_rwlock, so
> that down_read and down_write can handle concurrency problems. Then top
> calls perf_set_singlethreaded and switches to the single-threaded phase,
> assuming that no thread concurrency will happen later.
>
> However, a UAF problem could occur in perf top in the single-threaded phase.
> The concurrent procedure is like this:
>
> display_thread                  process_thread
> --------------                  --------------
>
> thread__comm_len
>   -> thread__comm_str
>     -> __thread__comm_str(thread)
>                                 thread__delete
>                                   -> comm__free
>                                     -> comm_str__put
>                                       -> zfree(&cs->str)
>   -> thread->comm_len = strlen(comm);
>
> Since perf_singlethreaded is true in the single-threaded phase, down_read
> and down_write do nothing there and cannot prevent concurrency problems.
>
> This patch moves perf_set_singlethreaded to the function tail to expand the
> multithreaded phase, so that display_thread and process_thread run safely.

I think it should be unconditional as perf top is always multi-threaded.

>
> Signed-off-by: Hangliang Lai <[email protected]>
> Reported-by: Wenyu Liu <[email protected]>
> Reviewed-by: Yunfeng Ye <[email protected]>
> ---
> tools/perf/builtin-top.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 7c6413447..74239940b 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1280,9 +1280,6 @@ static int __cmd_top(struct perf_top *top)
> top->evlist->core.threads, false,
> top->nr_threads_synthesize);
>
> - if (top->nr_threads_synthesize > 1)
> - perf_set_singlethreaded();

Instead, we can simply do

perf_set_multithreaded();

If top->nr_threads_synthesize > 1, no effect. If not, it turns
the switch on here.

> -
> if (perf_hpp_list.socket) {
> ret = perf_env__read_cpu_topology_map(&perf_env);
> if (ret < 0) {
> @@ -1359,6 +1356,10 @@ out_join:
> out_join_thread:
> pthread_cond_signal(&top->qe.cond);
> pthread_join(thread_process, NULL);
> +
> + if (top->nr_threads_synthesize > 1)
> + perf_set_singlethreaded();

And remove the condition here.

Thanks,
Namhyung


> +
> return ret;
> }
>
> --
> 2.33.0
>

2023-04-06 02:58:11

by Hangliang Lai

Subject: [PATCH v2] perf top: expand the range of multithreaded phase

In __cmd_top, perf_set_multithreaded is used to enable pthread_rwlock, so
that down_read and down_write can handle concurrency problems. Then top
calls perf_set_singlethreaded and switches to the single-threaded phase,
assuming that no thread concurrency will happen later.

However, a UAF problem could occur in perf top in the single-threaded phase.
The concurrent procedure is like this:

display_thread                  process_thread
--------------                  --------------
thread__comm_len
  -> thread__comm_str
    -> __thread__comm_str(thread)
                                thread__delete
                                  -> comm__free
                                    -> comm_str__put
                                      -> zfree(&cs->str)
  -> thread->comm_len = strlen(comm);

Since perf_singlethreaded is true in the single-threaded phase, down_read
and down_write do nothing there and cannot prevent concurrency problems.

This patch moves perf_set_singlethreaded to the function tail to expand the
multithreaded phase, so that display_thread and process_thread run safely.

Signed-off-by: Hangliang Lai <[email protected]>
Reviewed-by: Yunfeng Ye <[email protected]>
---
v1 -> v2
- Since perf top is always multi-threaded, remove the top->nr_threads_synthesize check.

 tools/perf/builtin-top.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index d4b5b02bab73..a18db1ee87fa 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1242,8 +1242,7 @@ static int __cmd_top(struct perf_top *top)
 	if (perf_session__register_idle_thread(top->session) < 0)
 		return ret;
 
-	if (top->nr_threads_synthesize > 1)
-		perf_set_multithreaded();
+	perf_set_multithreaded();
 
 	init_process_thread(top);
 
@@ -1273,9 +1272,6 @@ static int __cmd_top(struct perf_top *top)
 				    top->evlist->core.threads, true, false,
 				    top->nr_threads_synthesize);
 
-	if (top->nr_threads_synthesize > 1)
-		perf_set_singlethreaded();
-
 	if (perf_hpp_list.socket) {
 		ret = perf_env__read_cpu_topology_map(&perf_env);
 		if (ret < 0) {
@@ -1352,6 +1348,9 @@ static int __cmd_top(struct perf_top *top)
 out_join_thread:
 	cond_signal(&top->qe.cond);
 	pthread_join(thread_process, NULL);
+
+	perf_set_singlethreaded();
+
 	return ret;
 }

--
2.33.0

2023-04-07 21:32:24

by Namhyung Kim

Subject: Re: [PATCH v2] perf top: expand the range of multithreaded phase

Hello,

On Wed, Apr 5, 2023 at 7:54 PM Hangliang Lai <[email protected]> wrote:
>
> In __cmd_top, perf_set_multithreaded is used to enable pthread_rwlock, so
> that down_read and down_write can handle concurrency problems. Then top
> calls perf_set_singlethreaded and switches to the single-threaded phase,
> assuming that no thread concurrency will happen later.
>
> However, a UAF problem could occur in perf top in the single-threaded phase.
> The concurrent procedure is like this:
>
> display_thread                  process_thread
> --------------                  --------------
> thread__comm_len
>   -> thread__comm_str
>     -> __thread__comm_str(thread)
>                                 thread__delete
>                                   -> comm__free
>                                     -> comm_str__put
>                                       -> zfree(&cs->str)
>   -> thread->comm_len = strlen(comm);
>
> Since perf_singlethreaded is true in the single-threaded phase, down_read
> and down_write do nothing there and cannot prevent concurrency problems.
>
> This patch moves perf_set_singlethreaded to the function tail to expand the
> multithreaded phase, so that display_thread and process_thread run safely.
>
> Signed-off-by: Hangliang Lai <[email protected]>
> Reviewed-by: Yunfeng Ye <[email protected]>
> ---
> v1 -> v2
> - Since perf top is always multi-threaded, remove the top->nr_threads_synthesize check.

Not always, the synthesis can run in a single thread.

>
> tools/perf/builtin-top.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index d4b5b02bab73..a18db1ee87fa 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1242,8 +1242,7 @@ static int __cmd_top(struct perf_top *top)
> if (perf_session__register_idle_thread(top->session) < 0)
> return ret;
>
> - if (top->nr_threads_synthesize > 1)
> - perf_set_multithreaded();
> + perf_set_multithreaded();

I think this part should be kept as is.

>
> init_process_thread(top);
>
> @@ -1273,9 +1272,6 @@ static int __cmd_top(struct perf_top *top)
> top->evlist->core.threads, true, false,
> top->nr_threads_synthesize);
>
> - if (top->nr_threads_synthesize > 1)
> - perf_set_singlethreaded();

Here you can make it multi-threaded unconditionally.

Thanks,
Namhyung

> -
> if (perf_hpp_list.socket) {
> ret = perf_env__read_cpu_topology_map(&perf_env);
> if (ret < 0) {
> @@ -1352,6 +1348,9 @@ static int __cmd_top(struct perf_top *top)
> out_join_thread:
> cond_signal(&top->qe.cond);
> pthread_join(thread_process, NULL);
> +
> + perf_set_singlethreaded();
> +
> return ret;
> }
>
> --
> 2.33.0
>

2023-04-10 03:17:06

by Hangliang Lai

Subject: Re: [PATCH v2] perf top: expand the range of multithreaded phase

Thanks for your reply, Kim.

On 2023-04-07 21:21 you wrote:

> Not always, the synthesis can run in a single thread.

But I think that in machine__synthesize_threads, thread_nr threads are created to run synthesize_threads_worker (tools/perf/util/synthetic-events.c:970).

So that part is not single-threaded. Are we supposed to call perf_set_multithreaded() before the synthesis?

Thanks,
Hangliang Lai



2023-04-10 03:50:27

by Wenyu Liu

Subject: Re: [PATCH v2] perf top: expand the range of multithreaded phase

Hello, I think Namhyung means making it multi-threaded unconditionally only after the synthesis,

with a patch like this:

---
 tools/perf/builtin-top.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index d4b5b02bab73..60d00975b881 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1273,8 +1273,7 @@ static int __cmd_top(struct perf_top *top)
 				    top->evlist->core.threads, true, false,
 				    top->nr_threads_synthesize);
 
-	if (top->nr_threads_synthesize > 1)
-		perf_set_singlethreaded();
+	perf_set_multithreaded();
 
 	if (perf_hpp_list.socket) {
 		ret = perf_env__read_cpu_topology_map(&perf_env);
--

Right?

Thanks,
Wenyu

On 2023/4/10 10:58, Hangliang Lai wrote:
> Thanks for your reply, Kim.
>
> On 2023-04-07 21:21 you wrote:
>
>> Not always, the synthesis can run in a single thread.
>
> But I think that in machine__synthesize_threads, thread_nr threads are created to run synthesize_threads_worker (tools/perf/util/synthetic-events.c:970).
>
> So that part is not single-threaded. Are we supposed to call perf_set_multithreaded() before the synthesis?
>
> Thanks,
> Hangliang Lai
>

2023-04-10 13:42:32

by Hangliang Lai

Subject: [PATCH v3] perf top: expand the range of multithreaded phase

In __cmd_top, perf_set_multithreaded is used to enable pthread_rwlock, so
that down_read and down_write can handle concurrency problems. Then top
calls perf_set_singlethreaded and switches to the single-threaded phase,
assuming that no thread concurrency will happen later.

However, a UAF problem could occur in perf top in the single-threaded phase.
The concurrent procedure is like this:

display_thread                  process_thread
--------------                  --------------
thread__comm_len
  -> thread__comm_str
    -> __thread__comm_str(thread)
                                thread__delete
                                  -> comm__free
                                    -> comm_str__put
                                      -> zfree(&cs->str)
  -> thread->comm_len = strlen(comm);

Since perf_singlethreaded is true in the single-threaded phase, down_read
and down_write do nothing there and cannot prevent concurrency problems.

This patch moves perf_set_singlethreaded to the function tail to expand the
multithreaded phase, so that display_thread and process_thread run safely.

Signed-off-by: Hangliang Lai <[email protected]>
Reviewed-by: Yunfeng Ye <[email protected]>
---
v2 -> v3
- Sorry for my misunderstanding. Patch v3 calls perf_set_multithreaded
unconditionally after synthesis and perf_set_singlethreaded at the end.

 tools/perf/builtin-top.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index d4b5b02bab73..ae96ddaf85c4 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1273,8 +1273,7 @@ static int __cmd_top(struct perf_top *top)
 				    top->evlist->core.threads, true, false,
 				    top->nr_threads_synthesize);
 
-	if (top->nr_threads_synthesize > 1)
-		perf_set_singlethreaded();
+	perf_set_multithreaded();
 
 	if (perf_hpp_list.socket) {
 		ret = perf_env__read_cpu_topology_map(&perf_env);
@@ -1352,6 +1351,7 @@ static int __cmd_top(struct perf_top *top)
 out_join_thread:
 	cond_signal(&top->qe.cond);
 	pthread_join(thread_process, NULL);
+	perf_set_singlethreaded();
 	return ret;
 }

--
2.33.0

2023-04-10 15:43:15

by Namhyung Kim

Subject: Re: [PATCH v3] perf top: expand the range of multithreaded phase

Hello,

On Mon, Apr 10, 2023 at 6:22 AM Hangliang Lai <[email protected]> wrote:
>
> In __cmd_top, perf_set_multithreaded is used to enable pthread_rwlock, so
> that down_read and down_write can handle concurrency problems. Then top
> calls perf_set_singlethreaded and switches to the single-threaded phase,
> assuming that no thread concurrency will happen later.
>
> However, a UAF problem could occur in perf top in the single-threaded phase.
> The concurrent procedure is like this:
>
> display_thread                  process_thread
> --------------                  --------------
> thread__comm_len
>   -> thread__comm_str
>     -> __thread__comm_str(thread)
>                                 thread__delete
>                                   -> comm__free
>                                     -> comm_str__put
>                                       -> zfree(&cs->str)
>   -> thread->comm_len = strlen(comm);
>
> Since perf_singlethreaded is true in the single-threaded phase, down_read
> and down_write do nothing there and cannot prevent concurrency problems.
>
> This patch moves perf_set_singlethreaded to the function tail to expand the
> multithreaded phase, so that display_thread and process_thread run safely.
>
> Signed-off-by: Hangliang Lai <[email protected]>
> Reviewed-by: Yunfeng Ye <[email protected]>

Acked-by: Namhyung Kim <[email protected]>

Thanks,
Namhyung


> ---
> v2 -> v3
> - Sorry for my misunderstanding. Patch v3 calls perf_set_multithreaded
> unconditionally after synthesis and perf_set_singlethreaded at the end.
>
> tools/perf/builtin-top.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index d4b5b02bab73..ae96ddaf85c4 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1273,8 +1273,7 @@ static int __cmd_top(struct perf_top *top)
> top->evlist->core.threads, true, false,
> top->nr_threads_synthesize);
>
> - if (top->nr_threads_synthesize > 1)
> - perf_set_singlethreaded();
> + perf_set_multithreaded();
>
> if (perf_hpp_list.socket) {
> ret = perf_env__read_cpu_topology_map(&perf_env);
> @@ -1352,6 +1351,7 @@ static int __cmd_top(struct perf_top *top)
> out_join_thread:
> cond_signal(&top->qe.cond);
> pthread_join(thread_process, NULL);
> + perf_set_singlethreaded();
> return ret;
> }
>
> --
> 2.33.0
>

2023-04-11 01:33:05

by Hangliang Lai

Subject: [PATCH v4] perf top: expand the range of multithreaded phase

In __cmd_top, perf_set_multithreaded is used to enable pthread_rwlock, so
that down_read and down_write can handle concurrency problems. Then top
calls perf_set_singlethreaded and switches to the single-threaded phase,
assuming that no thread concurrency will happen later.

However, a UAF problem could occur in perf top in the single-threaded phase.
The concurrent procedure is like this:

display_thread                  process_thread
--------------                  --------------
thread__comm_len
  -> thread__comm_str
    -> __thread__comm_str(thread)
                                thread__delete
                                  -> comm__free
                                    -> comm_str__put
                                      -> zfree(&cs->str)
  -> thread->comm_len = strlen(comm);

Since perf_singlethreaded is true in the single-threaded phase, down_read
and down_write do nothing there and cannot prevent concurrency problems.

This patch moves perf_set_singlethreaded to the function tail to expand the
multithreaded phase, so that display_thread and process_thread run safely.

Signed-off-by: Hangliang Lai <[email protected]>
Co-developed-by: Wenyu Liu <[email protected]>
Reviewed-by: Yunfeng Ye <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
---
v3 -> v4
- Add Acked-by and Co-developed-by.

 tools/perf/builtin-top.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index d4b5b02bab73..ae96ddaf85c4 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1273,8 +1273,7 @@ static int __cmd_top(struct perf_top *top)
 				    top->evlist->core.threads, true, false,
 				    top->nr_threads_synthesize);
 
-	if (top->nr_threads_synthesize > 1)
-		perf_set_singlethreaded();
+	perf_set_multithreaded();
 
 	if (perf_hpp_list.socket) {
 		ret = perf_env__read_cpu_topology_map(&perf_env);
@@ -1352,6 +1351,7 @@ static int __cmd_top(struct perf_top *top)
 out_join_thread:
 	cond_signal(&top->qe.cond);
 	pthread_join(thread_process, NULL);
+	perf_set_singlethreaded();
 	return ret;
 }

--
2.33.0
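
The ordering that v3 and v4 settle on can be sketched as a small stand-alone model. This is only an illustration under stated assumptions, not the code in builtin-top.c: `run_top_model`, `worker`, and the plain boolean `singlethreaded` are hypothetical names, and the synthesis step is reduced to a comment. The point it shows is that the worker thread sees locking enabled even when nr_threads_synthesize is 1, and that the flag is restored only after the join.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for perf_singlethreaded. */
bool singlethreaded = true;

void set_multithreaded(void)  { singlethreaded = false; }
void set_singlethreaded(void) { singlethreaded = true; }

/* Stand-in for process_thread: records whether locking was enabled. */
void *worker(void *arg)
{
	bool *locking_enabled = arg;

	*locking_enabled = !singlethreaded;
	return NULL;
}

/* Model of the phase ordering in __cmd_top after the v3/v4 patch. */
int run_top_model(int nr_threads_synthesize, bool *locking_seen_by_worker)
{
	pthread_t thread;

	/* Unchanged by the patch: only parallel synthesis enables locking. */
	if (nr_threads_synthesize > 1)
		set_multithreaded();

	/* ... synthesis would run here ... */

	/* Patched: enable locking unconditionally before the threads start. */
	set_multithreaded();

	if (pthread_create(&thread, NULL, worker, locking_seen_by_worker))
		return -1;
	pthread_join(thread, NULL);

	/* Patched: go back to single-threaded mode only after the join. */
	set_singlethreaded();
	return 0;
}
```

Calling set_multithreaded() a second time when nr_threads_synthesize > 1 has no further effect, which matches the review comment earlier in the thread that the unconditional call simply turns the switch on when it was not already on.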

2023-04-12 13:42:22

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v4] perf top: expand the range of multithreaded phase

Em Tue, Apr 11, 2023 at 09:32:24AM +0800, Hangliang Lai escreveu:
> In __cmd_top, perf_set_multithreaded is used to enable pthread_rwlock, so
> that down_read and down_write can handle concurrency problems. Then top
> calls perf_set_singlethreaded and switches to the single-threaded phase,
> assuming that no thread concurrency will happen later.
>
> However, a UAF problem could occur in perf top in the single-threaded phase.
> The concurrent procedure is like this:
>
> display_thread                  process_thread
> --------------                  --------------
> thread__comm_len
>   -> thread__comm_str
>     -> __thread__comm_str(thread)
>                                 thread__delete
>                                   -> comm__free
>                                     -> comm_str__put
>                                       -> zfree(&cs->str)
>   -> thread->comm_len = strlen(comm);
>
> Since perf_singlethreaded is true in the single-threaded phase, down_read
> and down_write do nothing there and cannot prevent concurrency problems.
>
> This patch moves perf_set_singlethreaded to the function tail to expand the
> multithreaded phase, so that display_thread and process_thread run safely.
>
> Signed-off-by: Hangliang Lai <[email protected]>
> Co-developed-by: Wenyu Liu <[email protected]>
> Reviewed-by: Yunfeng Ye <[email protected]>
> Acked-by: Namhyung Kim <[email protected]>

Thanks, applied.

- Arnaldo


> ---
> v3 -> v4
> - Add Acked-by and Co-developed-by.
>
> tools/perf/builtin-top.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index d4b5b02bab73..ae96ddaf85c4 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -1273,8 +1273,7 @@ static int __cmd_top(struct perf_top *top)
> top->evlist->core.threads, true, false,
> top->nr_threads_synthesize);
>
> - if (top->nr_threads_synthesize > 1)
> - perf_set_singlethreaded();
> + perf_set_multithreaded();
>
> if (perf_hpp_list.socket) {
> ret = perf_env__read_cpu_topology_map(&perf_env);
> @@ -1352,6 +1351,7 @@ static int __cmd_top(struct perf_top *top)
> out_join_thread:
> cond_signal(&top->qe.cond);
> pthread_join(thread_process, NULL);
> + perf_set_singlethreaded();
> return ret;
> }
>
> --
> 2.33.0
>

--

- Arnaldo