2024-06-07 04:57:30

by Athira Rajeev

Subject: [PATCH 1/3] tools/perf: Fix the nrcpus in perf bench futex to enable the run when all CPU's are not online

perf bench futex fails as shown below when run on a powerpc system:

./perf bench futex all
Running futex/hash benchmark...
Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs.

perf: pthread_create: No such file or directory

In the setup where this perf bench was run, the partition had 640
CPUs, but only 80 of them were online. While blocking the threads
with futex_wait, the code sets the affinity using a cpumask. The
cpumask size used is 80, which is picked from
"nrcpus = perf_cpu_map__nr(cpu)". The benchmark fails while setting
affinity for a CPU number of 80 or higher, because it attempts to set
a bit position that is not allocated in the cpumask. Fix this by
sizing the cpumask by the number of possible CPUs rather than the
number of online CPUs.

Signed-off-by: Athira Rajeev <[email protected]>
---
tools/perf/bench/futex-hash.c | 2 +-
tools/perf/bench/futex-lock-pi.c | 2 +-
tools/perf/bench/futex-requeue.c | 2 +-
tools/perf/bench/futex-wake-parallel.c | 2 +-
tools/perf/bench/futex-wake.c | 2 +-
5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 0c69d20efa32..b472eded521b 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -174,7 +174,7 @@ int bench_futex_hash(int argc, const char **argv)
pthread_attr_init(&thread_attr);
gettimeofday(&bench__start, NULL);

- nrcpus = perf_cpu_map__nr(cpu);
+ nrcpus = cpu__max_cpu().cpu;
cpuset = CPU_ALLOC(nrcpus);
BUG_ON(!cpuset);
size = CPU_ALLOC_SIZE(nrcpus);
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
index 7a4973346180..0416120c091b 100644
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -122,7 +122,7 @@ static void create_threads(struct worker *w, struct perf_cpu_map *cpu)
{
cpu_set_t *cpuset;
unsigned int i;
- int nrcpus = perf_cpu_map__nr(cpu);
+ int nrcpus = cpu__max_cpu().cpu;
size_t size;

threads_starting = params.nthreads;
diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
index d9ad736c1a3e..aad5bfc4fe18 100644
--- a/tools/perf/bench/futex-requeue.c
+++ b/tools/perf/bench/futex-requeue.c
@@ -125,7 +125,7 @@ static void block_threads(pthread_t *w, struct perf_cpu_map *cpu)
{
cpu_set_t *cpuset;
unsigned int i;
- int nrcpus = perf_cpu_map__nr(cpu);
+ int nrcpus = cpu__max_cpu().cpu;
size_t size;

threads_starting = params.nthreads;
diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
index b66df553e561..90a5b91bf139 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -149,7 +149,7 @@ static void block_threads(pthread_t *w, struct perf_cpu_map *cpu)
{
cpu_set_t *cpuset;
unsigned int i;
- int nrcpus = perf_cpu_map__nr(cpu);
+ int nrcpus = cpu__max_cpu().cpu;
size_t size;

threads_starting = params.nthreads;
diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
index 690fd6d3da13..49b3c89b0b35 100644
--- a/tools/perf/bench/futex-wake.c
+++ b/tools/perf/bench/futex-wake.c
@@ -100,7 +100,7 @@ static void block_threads(pthread_t *w, struct perf_cpu_map *cpu)
cpu_set_t *cpuset;
unsigned int i;
size_t size;
- int nrcpus = perf_cpu_map__nr(cpu);
+ int nrcpus = cpu__max_cpu().cpu;
threads_starting = params.nthreads;

cpuset = CPU_ALLOC(nrcpus);
--
2.43.0
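
To illustrate the sizing problem described in the commit message above, here
is a minimal standalone sketch; it is not part of the patch. It assumes the
glibc <sched.h> macros, where CPU_SET_S() silently ignores a CPU number that
lies beyond the allocated size. The helper name cpu_fits_in_mask() is
hypothetical and exists only for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper: allocate a CPU mask sized for `nrcpus` CPUs, try to
 * set bit `cpu` in it, and report whether the bit actually ended up set.
 * Returns 1 if the bit is set, 0 if the request was out of range.
 */
static int cpu_fits_in_mask(int nrcpus, int cpu)
{
	cpu_set_t *cpuset = CPU_ALLOC(nrcpus);
	size_t size = CPU_ALLOC_SIZE(nrcpus);
	int is_set;

	/* Plays the role of BUG_ON(!cpuset) in the benchmark code. */
	assert(cpuset != NULL);

	CPU_ZERO_S(size, cpuset);
	/* Out-of-range CPU numbers are ignored rather than reported. */
	CPU_SET_S(cpu, size, cpuset);
	is_set = CPU_ISSET_S(cpu, size, cpuset) ? 1 : 0;

	CPU_FREE(cpuset);
	return is_set;
}
```

Under these assumptions, cpu_fits_in_mask(80, 600) returns 0, while
cpu_fits_in_mask(640, 600) returns 1. Note that CPU_ALLOC() rounds the
allocation up to a multiple of sizeof(unsigned long), so a CPU number only
slightly above 79 can still fit by accident.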



2024-06-07 04:57:49

by Athira Rajeev

Subject: [PATCH 3/3] tools/perf: Fix timing issue with parallel threads in perf bench wake-up-parallel

perf bench futex wake-parallel fails as shown below and hangs
intermittently when run on a powerpc system:

./perf bench futex wake-parallel
Running 'futex/wake-parallel' benchmark:
Run summary [PID 88588]: blocking on 640 threads (at [private] futex 0x10464b8c), 640 threads waking up 1 at a time.

[Run 1]: Avg per-thread latency (waking 1/640 threads) in 0.1309 ms (+-53.27%)
[Run 2]: Avg per-thread latency (waking 1/640 threads) in 0.0120 ms (+-31.16%)
[Run 3]: Avg per-thread latency (waking 1/640 threads) in 0.1474 ms (+-92.47%)
[Run 4]: Avg per-thread latency (waking 1/640 threads) in 0.2883 ms (+-67.75%)
[Run 5]: Avg per-thread latency (waking 1/640 threads) in 0.4108 ms (+-39.60%)
[Run 6]: Avg per-thread latency (waking 1/640 threads) in 0.7843 ms (+-78.98%)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)

The system where perf bench futex wake-parallel was run has a
configuration of 640 CPUs. After debugging, this turned out to be a
timing issue. The benchmark creates as many threads as there are CPUs
and each of them issues a futex_wait. It then does a usleep of 0.1
second before initiating futex_wake. On a system configuration with
more threads, this usleep time is not enough. The patch changes the
usleep from 100000 to 200000.

With the patch, multiple iterations were run and no further issues
were seen.

Reported-by: Disha Goel <[email protected]>
Signed-off-by: Athira Rajeev <[email protected]>
---
tools/perf/bench/futex-wake-parallel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
index 90a5b91bf139..4352e318631e 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -318,7 +318,7 @@ int bench_futex_wake_parallel(int argc, const char **argv)
cond_broadcast(&thread_worker);
mutex_unlock(&thread_lock);

- usleep(100000);
+ usleep(200000);

/* Ok, all threads are patiently blocked, start waking folks up */
wakeup_threads(waking_worker);
--
2.43.0
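
The "(0/1)" in the failing output above is consistent with a wake being
issued before the target thread has blocked in futex_wait. A minimal sketch
of that effect follows. It uses the raw futex(2) system call rather than
perf's own wrappers, and the helper names are hypothetical.

```c
#include <assert.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper around the raw futex(2) FUTEX_WAKE operation. */
static long futex_wake_private(uint32_t *uaddr, int nr_wake)
{
	assert(uaddr != NULL);
	return syscall(SYS_futex, uaddr, FUTEX_WAKE | FUTEX_PRIVATE_FLAG,
		       nr_wake, NULL, NULL, 0);
}

/*
 * Issue a wake on a futex word that no thread is waiting on.
 * The return value is the number of waiters that were woken.
 */
static long wake_with_no_waiters(void)
{
	static uint32_t futex_word;

	return futex_wake_private(&futex_word, 1);
}
```

Because nothing is waiting on the futex word, the wake finds no waiter and
reports zero woken tasks. A longer delay before waking only makes it more
likely that every thread has reached futex_wait; it does not guarantee it.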


2024-06-07 05:51:29

by Athira Rajeev

Subject: [PATCH 2/3] tools/perf: Fix the nrcpus in perf bench futex to enable the run when all CPU's are not online

perf bench epoll fails as shown below when run on a powerpc system:

./perf bench epoll wait
Running 'epoll/wait' benchmark:
Run summary [PID 627653]: 79 threads monitoring on 64 file-descriptors for 8 secs.

perf: pthread_create: No such file or directory

In the setup where this perf bench was run, the partition had 640
CPUs, but only 80 of them were online. While creating threads and
using epoll_wait, the code sets the affinity using a cpumask. The
cpumask size used is 80, which is picked from
"nrcpus = perf_cpu_map__nr(cpu)". The benchmark fails while setting
affinity for a CPU number of 80 or higher, because it attempts to set
a bit position that is not allocated in the cpumask. Fix this by
sizing the cpumask by the number of possible CPUs rather than the
number of online CPUs.

Signed-off-by: Athira Rajeev <[email protected]>
---
tools/perf/bench/epoll-ctl.c | 2 +-
tools/perf/bench/epoll-wait.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/bench/epoll-ctl.c b/tools/perf/bench/epoll-ctl.c
index d3db73dac66a..d66d852b90e4 100644
--- a/tools/perf/bench/epoll-ctl.c
+++ b/tools/perf/bench/epoll-ctl.c
@@ -232,7 +232,7 @@ static int do_threads(struct worker *worker, struct perf_cpu_map *cpu)
if (!noaffinity)
pthread_attr_init(&thread_attr);

- nrcpus = perf_cpu_map__nr(cpu);
+ nrcpus = cpu__max_cpu().cpu;
cpuset = CPU_ALLOC(nrcpus);
BUG_ON(!cpuset);
size = CPU_ALLOC_SIZE(nrcpus);
diff --git a/tools/perf/bench/epoll-wait.c b/tools/perf/bench/epoll-wait.c
index 06bb3187660a..ef5c4257844d 100644
--- a/tools/perf/bench/epoll-wait.c
+++ b/tools/perf/bench/epoll-wait.c
@@ -309,7 +309,7 @@ static int do_threads(struct worker *worker, struct perf_cpu_map *cpu)
if (!noaffinity)
pthread_attr_init(&thread_attr);

- nrcpus = perf_cpu_map__nr(cpu);
+ nrcpus = cpu__max_cpu().cpu;
cpuset = CPU_ALLOC(nrcpus);
BUG_ON(!cpuset);
size = CPU_ALLOC_SIZE(nrcpus);
--
2.43.0
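
The fix relies on the distinction between the CPUs a system could have and
the CPUs that are currently online. perf obtains its value from its own
cpu__max_cpu() helper; the sketch below is only a userspace analogue using
sysconf(3), and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical container for the two CPU counts reported by libc. */
struct cpu_counts {
	long configured;	/* CPUs the system is configured with */
	long online;		/* CPUs currently online */
};

/* Fill `out` with the configured and online CPU counts. */
static void read_cpu_counts(struct cpu_counts *out)
{
	assert(out != NULL);
	out->configured = sysconf(_SC_NPROCESSORS_CONF);
	out->online = sysconf(_SC_NPROCESSORS_ONLN);
}
```

On a partially onlined partition such as the one described above, the two
values would be expected to differ, which is why a mask sized by the online
count can be too small for the highest CPU number in use.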


2024-06-07 17:23:33

by Ian Rogers

Subject: Re: [PATCH 1/3] tools/perf: Fix the nrcpus in perf bench futex to enable the run when all CPU's are not online

On Thu, Jun 6, 2024 at 9:44 PM Athira Rajeev
<[email protected]> wrote:
>
> Perf bench futex fails as below when attempted to run on
> on a powerpc system:
>
> ./perf bench futex all
> Running futex/hash benchmark...
> Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs.
>
> perf: pthread_create: No such file or directory
>
> In the setup where this perf bench was ran, difference was that
> partition had 640 CPU's, but not all CPUs were online. 80 CPUs
> were online. While blocking the threads with futex_wait, code
> sets the affinity using cpumask. The cpumask size used is 80
> which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the
> benchmark reports fail while setting affinity for cpu number which
> is greater than 80 or higher, because it attempts to set a bit
> position which is not allocated on the cpumask. Fix this by changing
> the size of cpumask to number of possible cpus and not the number
> of online cpus.
>
> Signed-off-by: Athira Rajeev <[email protected]>

For the series:
Reviewed-by: Ian Rogers <[email protected]>

Thanks,
Ian


2024-06-08 12:16:25

by Athira Rajeev

Subject: Re: [PATCH 1/3] tools/perf: Fix the nrcpus in perf bench futex to enable the run when all CPU's are not online



> On 7 Jun 2024, at 10:53 PM, Ian Rogers <[email protected]> wrote:
>
> On Thu, Jun 6, 2024 at 9:44 PM Athira Rajeev
> <[email protected]> wrote:
>>
>> Perf bench futex fails as below when attempted to run on
>> on a powerpc system:
>>
>> ./perf bench futex all
>> Running futex/hash benchmark...
>> Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs.
>>
>> perf: pthread_create: No such file or directory
>>
>> In the setup where this perf bench was ran, difference was that
>> partition had 640 CPU's, but not all CPUs were online. 80 CPUs
>> were online. While blocking the threads with futex_wait, code
>> sets the affinity using cpumask. The cpumask size used is 80
>> which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the
>> benchmark reports fail while setting affinity for cpu number which
>> is greater than 80 or higher, because it attempts to set a bit
>> position which is not allocated on the cpumask. Fix this by changing
>> the size of cpumask to number of possible cpus and not the number
>> of online cpus.
>>
>> Signed-off-by: Athira Rajeev <[email protected]>
>
> For the series:
> Reviewed-by: Ian Rogers <[email protected]>

Hi Ian

Thanks for the review

Athira
>
> Thanks,
> Ian
>



2024-06-10 14:31:29

by Disha Goel

Subject: Re: [PATCH 1/3] tools/perf: Fix the nrcpus in perf bench futex to enable the run when all CPU's are not online

On 07/06/24 10:13 am, Athira Rajeev wrote:

> Perf bench futex fails as below when attempted to run on
> on a powerpc system:
>
> ./perf bench futex all
> Running futex/hash benchmark...
> Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs.
>
> perf: pthread_create: No such file or directory
>
> In the setup where this perf bench was ran, difference was that
> partition had 640 CPU's, but not all CPUs were online. 80 CPUs
> were online. While blocking the threads with futex_wait, code
> sets the affinity using cpumask. The cpumask size used is 80
> which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the
> benchmark reports fail while setting affinity for cpu number which
> is greater than 80 or higher, because it attempts to set a bit
> position which is not allocated on the cpumask. Fix this by changing
> the size of cpumask to number of possible cpus and not the number
> of online cpus.
>
> Signed-off-by: Athira Rajeev <[email protected]>

Thanks for the fix patches, Athira.
I have tested all three patches on a power machine (both small and max config),
and the perf bench futex and epoll tests run fine.

For the series:
Tested-by: Disha Goel <[email protected]>


2024-06-12 12:12:12

by Athira Rajeev

Subject: Re: [PATCH 1/3] tools/perf: Fix the nrcpus in perf bench futex to enable the run when all CPU's are not online



> On 10 Jun 2024, at 7:52 PM, Disha Goel <[email protected]> wrote:
>
> On 07/06/24 10:13 am, Athira Rajeev wrote:
>
>> Perf bench futex fails as below when attempted to run on
>> on a powerpc system:
>>
>> ./perf bench futex all
>> Running futex/hash benchmark...
>> Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs.
>>
>> perf: pthread_create: No such file or directory
>>
>> In the setup where this perf bench was ran, difference was that
>> partition had 640 CPU's, but not all CPUs were online. 80 CPUs
>> were online. While blocking the threads with futex_wait, code
>> sets the affinity using cpumask. The cpumask size used is 80
>> which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the
>> benchmark reports fail while setting affinity for cpu number which
>> is greater than 80 or higher, because it attempts to set a bit
>> position which is not allocated on the cpumask. Fix this by changing
>> the size of cpumask to number of possible cpus and not the number
>> of online cpus.
>>
>> Signed-off-by: Athira Rajeev <[email protected]>
>
> Thanks for the fix patches, Athira.
> I have tested all three patches on a power machine (both small and max config),
> and the perf bench futex and epoll tests run fine.
>
> For the series:
> Tested-by: Disha Goel <[email protected]>

Thanks Disha for testing the patchset.

Athira
>



2024-06-14 13:45:50

by Namhyung Kim

Subject: Re: [PATCH 1/3] tools/perf: Fix the nrcpus in perf bench futex to enable the run when all CPU's are not online

On Fri, 07 Jun 2024 10:13:52 +0530, Athira Rajeev wrote:

> Perf bench futex fails as below when attempted to run on
> on a powerpc system:
>
> ./perf bench futex all
> Running futex/hash benchmark...
> Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs.
>
> [...]

Applied to perf-tools-next after updating the commit log a bit, thanks!

Best regards,
Namhyung