2022-09-06 03:28:31

by Tiezhu Yang

Subject: [PATCH 2/3] perf bench syscall: Add close syscall benchmark

This commit adds a simple close syscall benchmark; more syscall
benchmarks can be added in the future.

Here are the test results:

[loongson@linux perf]$ ./perf bench syscall

# List of available benchmarks for collection 'syscall':

basic: Benchmark for basic getppid(2) calls
close: Benchmark for close(2) calls
all: Run all syscall benchmarks

[loongson@linux perf]$ ./perf bench syscall basic
# Running 'syscall/basic' benchmark:
# Executed 10000000 getppid() calls
Total time: 1.956 [sec]

0.195687 usecs/op
5110201 ops/sec
[loongson@linux perf]$ ./perf bench syscall close
# Running 'syscall/close' benchmark:
# Executed 10000000 close() calls
Total time: 6.302 [sec]

0.630297 usecs/op
1586553 ops/sec
[loongson@linux perf]$ ./perf bench syscall all
# Running syscall/basic benchmark...
# Executed 10000000 getppid() calls
Total time: 1.956 [sec]

0.195686 usecs/op
5110232 ops/sec

# Running syscall/close benchmark...
# Executed 10000000 close() calls
Total time: 6.302 [sec]

0.630271 usecs/op
1586619 ops/sec

Signed-off-by: Tiezhu Yang <[email protected]>
---
tools/perf/bench/bench.h | 1 +
tools/perf/bench/syscall.c | 11 +++++++++++
tools/perf/builtin-bench.c | 1 +
3 files changed, 13 insertions(+)

diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 6cefb43..916cd47 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -34,6 +34,7 @@ int bench_numa(int argc, const char **argv);
 int bench_sched_messaging(int argc, const char **argv);
 int bench_sched_pipe(int argc, const char **argv);
 int bench_syscall_basic(int argc, const char **argv);
+int bench_syscall_close(int argc, const char **argv);
 int bench_mem_memcpy(int argc, const char **argv);
 int bench_mem_memset(int argc, const char **argv);
 int bench_mem_find_bit(int argc, const char **argv);
diff --git a/tools/perf/bench/syscall.c b/tools/perf/bench/syscall.c
index 746fd71..058394b 100644
--- a/tools/perf/bench/syscall.c
+++ b/tools/perf/bench/syscall.c
@@ -46,6 +46,9 @@ static int bench_syscall_common(int argc, const char **argv, int syscall)
 		case __NR_getppid:
 			getppid();
 			break;
+		case __NR_close:
+			close(dup(0));
+			break;
 		default:
 			break;
 		}
@@ -58,6 +61,9 @@ static int bench_syscall_common(int argc, const char **argv, int syscall)
 	case __NR_getppid:
 		name = "getppid()";
 		break;
+	case __NR_close:
+		name = "close()";
+		break;
 	default:
 		break;
 	}
@@ -100,3 +106,8 @@ int bench_syscall_basic(int argc, const char **argv)
 {
 	return bench_syscall_common(argc, argv, __NR_getppid);
 }
+
+int bench_syscall_close(int argc, const char **argv)
+{
+	return bench_syscall_common(argc, argv, __NR_close);
+}
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index 334ab89..b63c711 100644
--- a/tools/perf/builtin-bench.c
+++ b/tools/perf/builtin-bench.c
@@ -52,6 +52,7 @@ static struct bench sched_benchmarks[] = {

 static struct bench syscall_benchmarks[] = {
 	{ "basic", "Benchmark for basic getppid(2) calls", bench_syscall_basic },
+	{ "close", "Benchmark for close(2) calls", bench_syscall_close },
 	{ "all", "Run all syscall benchmarks", NULL },
 	{ NULL, NULL, NULL },
 };
--
2.1.0


2022-09-06 03:37:47

by David Laight

Subject: RE: [PATCH 2/3] perf bench syscall: Add close syscall benchmark

From: Tiezhu Yang
> Sent: 06 September 2022 04:06
>
> This commit adds a simple close syscall benchmark; more syscall
> benchmarks can be added in the future.
>
...
>
> Signed-off-by: Tiezhu Yang <[email protected]>
> ---
> tools/perf/bench/bench.h | 1 +
> tools/perf/bench/syscall.c | 11 +++++++++++
> tools/perf/builtin-bench.c | 1 +
> 3 files changed, 13 insertions(+)
>
> diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
> index 6cefb43..916cd47 100644
...
> diff --git a/tools/perf/bench/syscall.c b/tools/perf/bench/syscall.c
> index 746fd71..058394b 100644
> --- a/tools/perf/bench/syscall.c
> +++ b/tools/perf/bench/syscall.c
> @@ -46,6 +46,9 @@ static int bench_syscall_common(int argc, const char **argv, int syscall)
> case __NR_getppid:
> getppid();
> break;
> + case __NR_close:
> + close(dup(0));

Not really a close() test.
The dup(0) call will be significant and may take longer.

I'm also not sure that using the syscall number for the
test number is entirely sensible.
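
The thread does not say what should replace the syscall number as the test selector. One possibility, given purely as an illustration, is a separate enum that indexes a table of per-test operations. Every name below (syscall_bench_id, syscall_benches, run_syscall_bench) is hypothetical and does not exist in perf.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical benchmark ids, independent of the __NR_* syscall numbers. */
enum syscall_bench_id {
	SYSCALL_BENCH_GETPPID,
	SYSCALL_BENCH_CLOSE,
	SYSCALL_BENCH_MAX,
};

struct syscall_bench {
	const char *name;	/* label printed in the report */
	void (*op)(void);	/* one iteration of the test */
};

static void op_getppid(void)
{
	(void)getppid();
}

static void op_close(void)
{
	close(dup(0));		/* same operation as in the patch */
}

static const struct syscall_bench syscall_benches[SYSCALL_BENCH_MAX] = {
	[SYSCALL_BENCH_GETPPID] = { "getppid()", op_getppid },
	[SYSCALL_BENCH_CLOSE]   = { "close()",   op_close },
};

/* Run `loops` iterations of the selected test; return its label or NULL. */
static const char *run_syscall_bench(enum syscall_bench_id id, long loops)
{
	if ((unsigned int)id >= SYSCALL_BENCH_MAX)
		return NULL;
	while (loops-- > 0)
		syscall_benches[id].op();
	return syscall_benches[id].name;
}
```

With a table like this, a test that issues several syscalls per iteration does not have to pretend to be identified by a single syscall number.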

One thing I have measured in the past is the time taken
to read in an iov[] array.
This can be measured quite nicely using writev() on /dev/null.
(No copies ever happen and iov_iter() is never used.)
But you need to test a few different iov lengths.

I'm also not 100% sure how accurate/repeatable/sensible it
is to use the 'wall clock time' for 1000000 iterations.
A lot of modern CPUs will dynamically change the clock speed
underneath you and other system code (like ethernet receive)
can badly perturb the results.

What you really want to use is a TSC - but they are now
useless for counting cycles.
The x86 performance counters do have a cycle counter.
I've used that to measure single calls of both library
functions and system calls.
Just 10 iterations give a 'cold cache' value and some
very consistent counts (remove real outliers).
Indeed the fastest value is really the right one.
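
The message does not say how the counter was read. As one hedged illustration, the sketch below uses the perf_event_open(2) syscall with PERF_COUNT_HW_CPU_CYCLES, counts a handful of single getppid() calls, and keeps the smallest count. The helper names open_cycle_counter() and min_cycles_getppid() are hypothetical. The counts also include part of the enable and disable ioctl() paths, so they overstate the cost of the call itself.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a CPU-cycle counter for the calling thread; returns an fd or -1. */
static int open_cycle_counter(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_HARDWARE;
	attr.size = sizeof(attr);
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.disabled = 1;	/* start stopped, enable around each call */
	attr.exclude_hv = 1;

	/* pid = 0, cpu = -1: this thread on any CPU; no group, no flags. */
	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/*
 * Count cycles for `runs` individual getppid() calls and return the
 * smallest count, or -1 if the counter cannot be opened or read
 * (for example when perf events are not permitted) or runs < 1.
 */
static int64_t min_cycles_getppid(int runs)
{
	int64_t best = -1;
	uint64_t count;
	int fd, i;

	fd = open_cycle_counter();
	if (fd < 0)
		return -1;

	for (i = 0; i < runs; i++) {
		ioctl(fd, PERF_EVENT_IOC_RESET, 0);
		ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
		(void)getppid();
		ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

		if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
			best = -1;
			break;
		}
		if (best < 0 || (int64_t)count < best)
			best = (int64_t)count;
	}
	close(fd);
	return best;
}
```

Taking the minimum follows the remark that the fastest value is the right one: interrupts and cache misses can only add cycles, so the smallest count is the closest to the undisturbed cost.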

For functions like the IP checksum you can even
show that the code is executing in the expected number
of clock cycles (usually limited by memory reads).

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2022-09-06 04:49:54

by Tiezhu Yang

Subject: Re: [PATCH 2/3] perf bench syscall: Add close syscall benchmark



On 09/06/2022 11:30 AM, David Laight wrote:
> From: Tiezhu Yang
>> Sent: 06 September 2022 04:06
>>
>> This commit adds a simple close syscall benchmark; more syscall
>> benchmarks can be added in the future.
>>
> ...
>>
>> Signed-off-by: Tiezhu Yang <[email protected]>
>> ---
>> tools/perf/bench/bench.h | 1 +
>> tools/perf/bench/syscall.c | 11 +++++++++++
>> tools/perf/builtin-bench.c | 1 +
>> 3 files changed, 13 insertions(+)
>>
>> diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
>> index 6cefb43..916cd47 100644
> ...
>> diff --git a/tools/perf/bench/syscall.c b/tools/perf/bench/syscall.c
>> index 746fd71..058394b 100644
>> --- a/tools/perf/bench/syscall.c
>> +++ b/tools/perf/bench/syscall.c
>> @@ -46,6 +46,9 @@ static int bench_syscall_common(int argc, const char **argv, int syscall)
>> case __NR_getppid:
>> getppid();
>> break;
>> + case __NR_close:
>> + close(dup(0));
>
> Not really a close() test.
> The dup(0) call will be significant and may take longer.
>
> I'm also not sure that using the syscall number for the
> test number is entirely sensible.
>
> One thing I have measured in the past is the time taken
> to read in an iov[] array.
> This can be measured quite nicely using writev() on /dev/null.
> (No copies ever happen and iov_iter() is never used.)
> But you need to test a few different iov lengths.
>
> I'm also not 100% sure how accurate/repeatable/sensible it
> is to use the 'wall clock time' for 1000000 iterations.
> A lot of modern CPUs will dynamically change the clock speed
> underneath you and other system code (like ethernet receive)
> can badly perturb the results.
>
> What you really want to use is a TSC - but they are now
> useless for counting cycles.
> The x86 performance counters do have a cycle counter.
> I've used that to measure single calls of both library
> functions and system calls.
> Just 10 iterations give a 'cold cache' value and some
> very consistent counts (remove real outliers).
> Indeed the fastest value is really the right one.
>
> For functions like the IP checksum you can even
> show that the code is executing in the expected number
> of clock cycles (usually limited by memory reads).
>
> David
>

Hi David,

Thanks for your reply.

There are some explanations in commit c2a08203052f ("perf bench:
Add basic syscall benchmark"). I presume the current benchmark
framework works well; if not, maybe we should modify the framework
first.

The initial aim of this patch series is to benchmark more syscalls;
some of the code is similar to the UnixBench syscall test [1].

[1]
https://github.com/kdlucas/byte-unixbench/blob/master/UnixBench/src/syscall.c

Thanks,
Tiezhu