2020-09-02 13:45:08

by Leo Yan

Subject: [PATCH v3 0/6] Perf tool: Support TSC for Arm64

This patch set refactors the changes from the earlier patch set 'Perf
tool: Enable Arm arch timer counter and arm-spe's timestamp' [1].

After reviewing the old patch set (thanks Wei Li and Al Grant), we
think the right approach is to consolidate the TSC code and extend the
TSC implementation into common code that can support both x86 and
Arm64. So far, x86 needs to support cap_user_time_zero and Arm64 needs
to support cap_user_time_short. For architecture-specific code, each
arch only needs to implement its own rdtsc() to read out the timer's
counter.
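
As a rough illustration of that split, the sketch below folds the
per-arch readers into one function. It is not code from this series:
read_arch_counter() and arch_counter_supported() are hypothetical names,
and the fallback of returning 0 simply mirrors the weak rdtsc() in
util/tsc.c.

```c
#include <stdint.h>

/* Hypothetical helper: 1 where a counter reader exists below, else 0. */
static inline int arch_counter_supported(void)
{
#if defined(__x86_64__) || defined(__i386__) || defined(__aarch64__)
	return 1;
#else
	return 0;
#endif
}

/*
 * Hypothetical helper: read the free-running counter of the current
 * architecture, or return 0 where no reader is implemented.
 */
static inline uint64_t read_arch_counter(void)
{
#if defined(__x86_64__) || defined(__i386__)
	uint32_t low, high;

	/* RDTSC returns the 64-bit TSC split across EDX:EAX. */
	__asm__ volatile("rdtsc" : "=a" (low), "=d" (high));
	return low | ((uint64_t)high << 32);
#elif defined(__aarch64__)
	uint64_t val;

	/* CNTVCT_EL0 is the virtual counter, readable from user space. */
	__asm__ volatile("mrs %0, cntvct_el0" : "=r" (val));
	return val;
#else
	return 0;
#endif
}
```

In the series itself the selection happens at link time (a weak default
plus one arch file per architecture) rather than with preprocessor
checks in a single function.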

This patch set refactors the TSC implementation and moves the TSC code
from the x86 folder to util/tsc.c, which allows all archs to reuse the
code. It also moves the TSC test from the x86 folder to tests so that it
can work as a common test.

This patch set has been tested on Arm64 (DB410c) and x86_64. Both pass
the test:

$ perf test list
[...]
68: Convert perf time to TSC
[...]

$ perf test 68 -v
68: Convert perf time to TSC
--- start ---
test child forked, pid 10961
mmap size 528384B
1st event perf time 35715036563417 tsc 686221770989
rdtsc time 35715036649719 tsc 686221772647
2nd event perf time 35715036660448 tsc 686221772852
test child finished with 0
---- end ----
Convert perf time to TSC: Ok
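
The three samples in the output above are ordered in both perf time and
TSC. A minimal sketch of such an ordering check follows, assuming the
pass criterion is that the rdtsc sample lies between the two events in
both domains; struct sample and sample_between() are hypothetical names
used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pair of values as printed on each line of the output. */
struct sample {
	uint64_t time;	/* perf time in ns */
	uint64_t tsc;	/* counter cycles */
};

/* True if 'mid' lies between 'first' and 'second' in both domains. */
static bool sample_between(const struct sample *first,
			   const struct sample *mid,
			   const struct sample *second)
{
	return first->time <= mid->time && mid->time <= second->time &&
	       first->tsc <= mid->tsc && mid->tsc <= second->tsc;
}
```

With the values from the log (1st event, rdtsc, 2nd event) the check
holds, since 686221770989 <= 686221772647 <= 686221772852 and the perf
times are ordered the same way.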

Changes from v2:
* Refactored patch set to move TSC common code to util/tsc.c (Wei/Al);
* Moved TSC testing to perf/tests (Wei);
* Dropped the Arm SPE timestamp patch so this set has a clear purpose and
is easier to review; the Arm SPE timestamp will be sent as a separate patch.

[1] https://lore.kernel.org/patchwork/cover/1285130/

Leo Yan (6):
perf tsc: Move out common functions from x86
perf tsc: Add rdtsc() for Arm64
perf tsc: Calculate timestamp with cap_user_time_short
perf tsc: Support cap_user_time_short for event TIME_CONV
perf tests tsc: Make tsc testing as a common testing
perf tests tsc: Add checking helper is_supported()

tools/lib/perf/include/perf/event.h | 4 +
tools/perf/arch/arm64/util/Build | 1 +
tools/perf/arch/arm64/util/tsc.c | 14 ++++
tools/perf/arch/x86/include/arch-tests.h | 1 -
tools/perf/arch/x86/tests/Build | 1 -
tools/perf/arch/x86/tests/arch-tests.c | 4 -
tools/perf/arch/x86/util/tsc.c | 73 +----------------
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 5 ++
.../{arch/x86 => }/tests/perf-time-to-tsc.c | 13 +++
tools/perf/tests/tests.h | 2 +
tools/perf/util/jitdump.c | 14 ++--
tools/perf/util/synthetic-events.c | 8 --
tools/perf/util/tsc.c | 81 +++++++++++++++++++
tools/perf/util/tsc.h | 5 ++
15 files changed, 136 insertions(+), 91 deletions(-)
create mode 100644 tools/perf/arch/arm64/util/tsc.c
rename tools/perf/{arch/x86 => }/tests/perf-time-to-tsc.c (93%)

--
2.17.1


2020-09-02 13:45:36

by Leo Yan

Subject: [PATCH v3 1/6] perf tsc: Move out common functions from x86

Functions perf_read_tsc_conversion() and perf_event__synth_time_conv()
should work as common functions rather than x86 specific, so move these
two functions out from arch/x86 folder and place them into util/tsc.c.

Since the function perf_event__synth_time_conv() will be linked in
util/tsc.c, remove its weak version.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/arch/x86/util/tsc.c | 73 +-----------------------------
tools/perf/util/synthetic-events.c | 8 ----
tools/perf/util/tsc.c | 71 +++++++++++++++++++++++++++++
3 files changed, 72 insertions(+), 80 deletions(-)

diff --git a/tools/perf/arch/x86/util/tsc.c b/tools/perf/arch/x86/util/tsc.c
index 2f55afb14e1f..559365f8fe52 100644
--- a/tools/perf/arch/x86/util/tsc.c
+++ b/tools/perf/arch/x86/util/tsc.c
@@ -1,45 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
-#include <stdbool.h>
-#include <errno.h>
-
-#include <linux/stddef.h>
-#include <linux/perf_event.h>
-
#include <linux/types.h>
-#include <asm/barrier.h>
-#include "../../../util/debug.h"
-#include "../../../util/event.h"
-#include "../../../util/synthetic-events.h"
-#include "../../../util/tsc.h"
-
-int perf_read_tsc_conversion(const struct perf_event_mmap_page *pc,
- struct perf_tsc_conversion *tc)
-{
- bool cap_user_time_zero;
- u32 seq;
- int i = 0;
-
- while (1) {
- seq = pc->lock;
- rmb();
- tc->time_mult = pc->time_mult;
- tc->time_shift = pc->time_shift;
- tc->time_zero = pc->time_zero;
- cap_user_time_zero = pc->cap_user_time_zero;
- rmb();
- if (pc->lock == seq && !(seq & 1))
- break;
- if (++i > 10000) {
- pr_debug("failed to get perf_event_mmap_page lock\n");
- return -EINVAL;
- }
- }

- if (!cap_user_time_zero)
- return -EOPNOTSUPP;
-
- return 0;
-}
+#include "../../../util/tsc.h"

u64 rdtsc(void)
{
@@ -49,36 +11,3 @@ u64 rdtsc(void)

return low | ((u64)high) << 32;
}
-
-int perf_event__synth_time_conv(const struct perf_event_mmap_page *pc,
- struct perf_tool *tool,
- perf_event__handler_t process,
- struct machine *machine)
-{
- union perf_event event = {
- .time_conv = {
- .header = {
- .type = PERF_RECORD_TIME_CONV,
- .size = sizeof(struct perf_record_time_conv),
- },
- },
- };
- struct perf_tsc_conversion tc;
- int err;
-
- if (!pc)
- return 0;
- err = perf_read_tsc_conversion(pc, &tc);
- if (err == -EOPNOTSUPP)
- return 0;
- if (err)
- return err;
-
- pr_debug2("Synthesizing TSC conversion information\n");
-
- event.time_conv.time_mult = tc.time_mult;
- event.time_conv.time_shift = tc.time_shift;
- event.time_conv.time_zero = tc.time_zero;
-
- return process(tool, &event, NULL, machine);
-}
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 89b390623b63..3ca5d9399680 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -2006,14 +2006,6 @@ int perf_event__synthesize_stat_events(struct perf_stat_config *config, struct p
return 0;
}

-int __weak perf_event__synth_time_conv(const struct perf_event_mmap_page *pc __maybe_unused,
- struct perf_tool *tool __maybe_unused,
- perf_event__handler_t process __maybe_unused,
- struct machine *machine __maybe_unused)
-{
- return 0;
-}
-
extern const struct perf_header_feature_ops feat_ops[HEADER_LAST_FEATURE];

int perf_event__synthesize_features(struct perf_tool *tool, struct perf_session *session,
diff --git a/tools/perf/util/tsc.c b/tools/perf/util/tsc.c
index bfa782421cbd..9e3f04ddddf8 100644
--- a/tools/perf/util/tsc.c
+++ b/tools/perf/util/tsc.c
@@ -1,7 +1,16 @@
// SPDX-License-Identifier: GPL-2.0
+#include <errno.h>
+
#include <linux/compiler.h>
+#include <linux/perf_event.h>
+#include <linux/stddef.h>
#include <linux/types.h>

+#include <asm/barrier.h>
+
+#include "event.h"
+#include "synthetic-events.h"
+#include "debug.h"
#include "tsc.h"

u64 perf_time_to_tsc(u64 ns, struct perf_tsc_conversion *tc)
@@ -25,6 +34,68 @@ u64 tsc_to_perf_time(u64 cyc, struct perf_tsc_conversion *tc)
((rem * tc->time_mult) >> tc->time_shift);
}

+int perf_read_tsc_conversion(const struct perf_event_mmap_page *pc,
+ struct perf_tsc_conversion *tc)
+{
+ bool cap_user_time_zero;
+ u32 seq;
+ int i = 0;
+
+ while (1) {
+ seq = pc->lock;
+ rmb();
+ tc->time_mult = pc->time_mult;
+ tc->time_shift = pc->time_shift;
+ tc->time_zero = pc->time_zero;
+ cap_user_time_zero = pc->cap_user_time_zero;
+ rmb();
+ if (pc->lock == seq && !(seq & 1))
+ break;
+ if (++i > 10000) {
+ pr_debug("failed to get perf_event_mmap_page lock\n");
+ return -EINVAL;
+ }
+ }
+
+ if (!cap_user_time_zero)
+ return -EOPNOTSUPP;
+
+ return 0;
+}
+
+int perf_event__synth_time_conv(const struct perf_event_mmap_page *pc,
+ struct perf_tool *tool,
+ perf_event__handler_t process,
+ struct machine *machine)
+{
+ union perf_event event = {
+ .time_conv = {
+ .header = {
+ .type = PERF_RECORD_TIME_CONV,
+ .size = sizeof(struct perf_record_time_conv),
+ },
+ },
+ };
+ struct perf_tsc_conversion tc;
+ int err;
+
+ if (!pc)
+ return 0;
+ err = perf_read_tsc_conversion(pc, &tc);
+ if (err == -EOPNOTSUPP)
+ return 0;
+ if (err)
+ return err;
+
+ pr_debug2("Synthesizing TSC conversion information\n");
+
+ event.time_conv.time_mult = tc.time_mult;
+ event.time_conv.time_shift = tc.time_shift;
+ event.time_conv.time_zero = tc.time_zero;
+
+ return process(tool, &event, NULL, machine);
+}
+
u64 __weak rdtsc(void)
{
return 0;
--
2.17.1
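
The retry loop in perf_read_tsc_conversion() follows the usual sequence
count reader pattern: sample the count, copy the fields, and accept the
copy only if the count is unchanged and even. The sketch below restates
that pattern with C11 atomics instead of rmb(). It is an illustration
under assumptions, not the perf code: struct conv_page, struct
conv_params and read_conv_params() are hypothetical, and the retry
bound is only approximately the same as in the patch.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the perf_event_mmap_page fields used here. */
struct conv_page {
	_Atomic uint32_t lock;	/* sequence count, odd while being updated */
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

struct conv_params {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
};

/* Returns 0 on a consistent snapshot, -EINVAL if none was obtained. */
static int read_conv_params(struct conv_page *pc, struct conv_params *out)
{
	for (int i = 0; i < 10000; i++) {
		uint32_t seq = atomic_load_explicit(&pc->lock,
						    memory_order_acquire);

		/*
		 * Plain reads, as in the original loop; a strictly
		 * conforming C11 version would use relaxed atomics here.
		 */
		out->time_shift = pc->time_shift;
		out->time_mult = pc->time_mult;
		out->time_zero = pc->time_zero;

		/* Order the field reads before re-reading the count. */
		atomic_thread_fence(memory_order_acquire);

		if (atomic_load_explicit(&pc->lock,
					 memory_order_relaxed) == seq &&
		    !(seq & 1))
			return 0;
	}

	return -EINVAL;
}
```

An odd count means an update is in progress, so the reader retries until
it sees the same even value before and after copying the fields.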

2020-09-02 13:51:57

by Leo Yan

Subject: [PATCH v3 2/6] perf tsc: Add rdtsc() for Arm64

The system register CNTVCT_EL0 can be used to retrieve the counter from
user space. Add rdtsc() for Arm64.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/arch/arm64/util/Build | 1 +
tools/perf/arch/arm64/util/tsc.c | 14 ++++++++++++++
2 files changed, 15 insertions(+)
create mode 100644 tools/perf/arch/arm64/util/tsc.c

diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index 5c13438c7bd4..b53294d74b01 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -1,6 +1,7 @@
perf-y += header.o
perf-y += machine.o
perf-y += perf_regs.o
+perf-y += tsc.o
perf-$(CONFIG_DWARF) += dwarf-regs.o
perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
diff --git a/tools/perf/arch/arm64/util/tsc.c b/tools/perf/arch/arm64/util/tsc.c
new file mode 100644
index 000000000000..53c6adf8ea6e
--- /dev/null
+++ b/tools/perf/arch/arm64/util/tsc.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/types.h>
+
+#include "../../../util/tsc.h"
+
+u64 rdtsc(void)
+{
+ u64 val;
+
+ asm volatile("mrs %0, cntvct_el0" : "=r" (val));
+
+ return val;
+}
--
2.17.1

2020-09-02 14:08:00

by Peter Zijlstra

Subject: Re: [PATCH v3 2/6] perf tsc: Add rdtsc() for Arm64

On Wed, Sep 02, 2020 at 02:21:27PM +0100, Leo Yan wrote:
> The system register CNTVCT_EL0 can be used to retrieve the counter from
> user space. Add rdtsc() for Arm64.

> +u64 rdtsc(void)
> +{
> + u64 val;

Would it make sense to put a comment in that this counter is/could-be
'short' ? Because unlike x86-TSC, this thing isn't architecturally
specified to be 64bits wide.

> + asm volatile("mrs %0, cntvct_el0" : "=r" (val));
> +
> + return val;
> +}
> --
> 2.17.1
>

2020-09-02 15:47:43

by Leo Yan

Subject: [PATCH v3 4/6] perf tsc: Support cap_user_time_short for event TIME_CONV

The synthesized event TIME_CONV doesn't contain the complete set of
counter parameters, which leads to wrong conversion between counter
cycles and timestamps.

This patch extends the TIME_CONV event to record the flag
'cap_user_time_zero', which indicates whether the counter parameters are
valid; if they are not, the timestamp calculation directly returns 0. It
also records the flag 'cap_user_time_short' and its related fields
'time_cycles' and 'time_mask' for cycle calibration.

Signed-off-by: Leo Yan <[email protected]>
---
tools/lib/perf/include/perf/event.h | 4 ++++
tools/perf/util/jitdump.c | 14 +++++++++-----
tools/perf/util/tsc.c | 4 ++++
3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h
index 842028858d66..a6dbba6b9073 100644
--- a/tools/lib/perf/include/perf/event.h
+++ b/tools/lib/perf/include/perf/event.h
@@ -324,6 +324,10 @@ struct perf_record_time_conv {
__u64 time_shift;
__u64 time_mult;
__u64 time_zero;
+ __u64 time_cycles;
+ __u64 time_mask;
+ bool cap_user_time_zero;
+ bool cap_user_time_short;
};

struct perf_record_header_feature {
diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index 0804308ef285..055bab7a92b3 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -374,11 +374,15 @@ static uint64_t convert_timestamp(struct jit_buf_desc *jd, uint64_t timestamp)
if (!jd->use_arch_timestamp)
return timestamp;

- tc.time_shift = jd->session->time_conv.time_shift;
- tc.time_mult = jd->session->time_conv.time_mult;
- tc.time_zero = jd->session->time_conv.time_zero;
-
- if (!tc.time_mult)
+ tc.time_shift = jd->session->time_conv.time_shift;
+ tc.time_mult = jd->session->time_conv.time_mult;
+ tc.time_zero = jd->session->time_conv.time_zero;
+ tc.time_cycles = jd->session->time_conv.time_cycles;
+ tc.time_mask = jd->session->time_conv.time_mask;
+ tc.cap_user_time_zero = jd->session->time_conv.cap_user_time_zero;
+ tc.cap_user_time_short = jd->session->time_conv.cap_user_time_short;
+
+ if (!tc.cap_user_time_zero)
return 0;

return tsc_to_perf_time(timestamp, &tc);
diff --git a/tools/perf/util/tsc.c b/tools/perf/util/tsc.c
index c0ca40204649..62b4c75c966c 100644
--- a/tools/perf/util/tsc.c
+++ b/tools/perf/util/tsc.c
@@ -98,6 +98,10 @@ int perf_event__synth_time_conv(const struct perf_event_mmap_page *pc,
event.time_conv.time_mult = tc.time_mult;
event.time_conv.time_shift = tc.time_shift;
event.time_conv.time_zero = tc.time_zero;
+ event.time_conv.time_cycles = tc.time_cycles;
+ event.time_conv.time_mask = tc.time_mask;
+ event.time_conv.cap_user_time_zero = tc.cap_user_time_zero;
+ event.time_conv.cap_user_time_short = tc.cap_user_time_short;

return process(tool, &event, NULL, machine);
}
--
2.17.1
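
To make the gating on 'cap_user_time_zero' concrete, here is a minimal
sketch of a cycles-to-nanoseconds conversion that returns 0 when the
parameters are not valid. The arithmetic follows the quotient/remainder
form used by tsc_to_perf_time(); struct tsc_conv and cycles_to_ns() are
hypothetical names, and the 'cap_user_time_short' handling is left out
here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down copy of the conversion parameters. */
struct tsc_conv {
	uint16_t time_shift;
	uint32_t time_mult;
	uint64_t time_zero;
	bool cap_user_time_zero;
};

/*
 * ns = time_zero + (cyc * time_mult) >> time_shift, computed in two
 * parts so that the multiplication does not overflow 64 bits for
 * large cycle counts.
 */
static uint64_t cycles_to_ns(uint64_t cyc, const struct tsc_conv *tc)
{
	uint64_t quot, rem;

	/* Without cap_user_time_zero the parameters are not usable. */
	if (!tc->cap_user_time_zero)
		return 0;

	quot = cyc >> tc->time_shift;
	rem = cyc & (((uint64_t)1 << tc->time_shift) - 1);

	return tc->time_zero + quot * tc->time_mult +
	       ((rem * tc->time_mult) >> tc->time_shift);
}
```

For example, with time_shift = 10, time_mult = 2048 (2 ns per cycle) and
time_zero = 1000, a count of 1500 cycles converts to 4000 ns.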

2020-09-02 15:48:09

by Leo Yan

Subject: [PATCH v3 6/6] perf tests tsc: Add checking helper is_supported()

So far TSC is enabled on the x86_64, i386 and Arm64 architectures; add a
checking helper to skip this test on other architectures.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/tests/builtin-test.c | 1 +
tools/perf/tests/perf-time-to-tsc.c | 13 +++++++++++++
tools/perf/tests/tests.h | 1 +
3 files changed, 15 insertions(+)

diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 67e6a1c6c793..4b57ea79d3e7 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -344,6 +344,7 @@ static struct test generic_tests[] = {
{
.desc = "Convert perf time to TSC",
.func = test__perf_time_to_tsc,
+ .is_supported = test__tsc_is_supported,
},
{
.func = NULL,
diff --git a/tools/perf/tests/perf-time-to-tsc.c b/tools/perf/tests/perf-time-to-tsc.c
index 026d32ed078e..41dadd4cd097 100644
--- a/tools/perf/tests/perf-time-to-tsc.c
+++ b/tools/perf/tests/perf-time-to-tsc.c
@@ -171,3 +171,16 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest __maybe
evlist__delete(evlist);
return err;
}
+
+bool test__tsc_is_supported(void)
+{
+ /*
+ * Except x86_64/i386 and Arm64, other archs don't support TSC in perf.
+ * Just enable the test for x86_64/i386 and Arm64 archs.
+ */
+#if defined(__x86_64__) || defined(__i386__) || defined(__aarch64__)
+ return true;
+#else
+ return false;
+#endif
+}
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 1633f54d6156..86466a518d8e 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -127,6 +127,7 @@ int test__perf_time_to_tsc(struct test *test, int subtest);
bool test__bp_signal_is_supported(void);
bool test__bp_account_is_supported(void);
bool test__wp_is_supported(void);
+bool test__tsc_is_supported(void);

#if defined(__arm__) || defined(__aarch64__)
#ifdef HAVE_DWARF_UNWIND_SUPPORT
--
2.17.1
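
The effect of an .is_supported callback can be sketched as below. This
is a simplified model rather than the runner in builtin-test.c: struct
test_case, run_test(), the result codes and the two demo stubs are all
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch. */
enum { TEST_OK = 0, TEST_FAIL = -1, TEST_SKIP = -2 };

/* Hypothetical test descriptor with an optional support check. */
struct test_case {
	const char *desc;
	int (*func)(void);
	bool (*is_supported)(void);	/* NULL means always supported */
};

/* Skip the test when a support check exists and reports false. */
static int run_test(const struct test_case *t)
{
	if (t->is_supported && !t->is_supported())
		return TEST_SKIP;

	return t->func();
}

/* Demo stubs, only here to exercise run_test(). */
static int demo_pass(void)
{
	return TEST_OK;
}

static bool demo_unsupported(void)
{
	return false;
}
```

A test without a callback always runs, while one whose callback returns
false is reported as skipped instead of failed.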

2020-09-02 15:50:14

by Leo Yan

Subject: [PATCH v3 5/6] perf tests tsc: Make tsc testing as a common testing

The x86 arch provides a test for conversion between TSC and perf time,
located in the x86 arch folder. Move this test out of the x86 arch
folder and into the common tests folder, so that the TSC test can be
executed on other architectures (e.g. Arm64).

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/arch/x86/include/arch-tests.h | 1 -
tools/perf/arch/x86/tests/Build | 1 -
tools/perf/arch/x86/tests/arch-tests.c | 4 ----
tools/perf/tests/Build | 1 +
tools/perf/tests/builtin-test.c | 4 ++++
tools/perf/{arch/x86 => }/tests/perf-time-to-tsc.c | 0
tools/perf/tests/tests.h | 1 +
7 files changed, 6 insertions(+), 6 deletions(-)
rename tools/perf/{arch/x86 => }/tests/perf-time-to-tsc.c (100%)

diff --git a/tools/perf/arch/x86/include/arch-tests.h b/tools/perf/arch/x86/include/arch-tests.h
index c41c5affe4be..6a54b94f1c25 100644
--- a/tools/perf/arch/x86/include/arch-tests.h
+++ b/tools/perf/arch/x86/include/arch-tests.h
@@ -7,7 +7,6 @@ struct test;

/* Tests */
int test__rdpmc(struct test *test __maybe_unused, int subtest);
-int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest);
int test__insn_x86(struct test *test __maybe_unused, int subtest);
int test__intel_pt_pkt_decoder(struct test *test, int subtest);
int test__bp_modify(struct test *test, int subtest);
diff --git a/tools/perf/arch/x86/tests/Build b/tools/perf/arch/x86/tests/Build
index 2997c506550c..36d4f248b51d 100644
--- a/tools/perf/arch/x86/tests/Build
+++ b/tools/perf/arch/x86/tests/Build
@@ -3,6 +3,5 @@ perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o

perf-y += arch-tests.o
perf-y += rdpmc.o
-perf-y += perf-time-to-tsc.o
perf-$(CONFIG_AUXTRACE) += insn-x86.o intel-pt-pkt-decoder-test.o
perf-$(CONFIG_X86_64) += bp-modify.o
diff --git a/tools/perf/arch/x86/tests/arch-tests.c b/tools/perf/arch/x86/tests/arch-tests.c
index 6763135aec17..bc25d727b4e9 100644
--- a/tools/perf/arch/x86/tests/arch-tests.c
+++ b/tools/perf/arch/x86/tests/arch-tests.c
@@ -8,10 +8,6 @@ struct test arch_tests[] = {
.desc = "x86 rdpmc",
.func = test__rdpmc,
},
- {
- .desc = "Convert perf time to TSC",
- .func = test__perf_time_to_tsc,
- },
#ifdef HAVE_DWARF_UNWIND_SUPPORT
{
.desc = "DWARF unwind",
diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build
index 84352fc49a20..3ae29b2a3d0f 100644
--- a/tools/perf/tests/Build
+++ b/tools/perf/tests/Build
@@ -60,6 +60,7 @@ perf-y += api-io.o
perf-y += demangle-java-test.o
perf-y += pfm.o
perf-y += parse-metric.o
+perf-y += perf-time-to-tsc.o

$(OUTPUT)tests/llvm-src-base.c: tests/bpf-script-example.c tests/Build
$(call rule_mkdir)
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index d328caaba45d..67e6a1c6c793 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -341,6 +341,10 @@ static struct test generic_tests[] = {
.desc = "Parse and process metrics",
.func = test__parse_metric,
},
+ {
+ .desc = "Convert perf time to TSC",
+ .func = test__perf_time_to_tsc,
+ },
{
.func = NULL,
},
diff --git a/tools/perf/arch/x86/tests/perf-time-to-tsc.c b/tools/perf/tests/perf-time-to-tsc.c
similarity index 100%
rename from tools/perf/arch/x86/tests/perf-time-to-tsc.c
rename to tools/perf/tests/perf-time-to-tsc.c
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index 4447a516c689..1633f54d6156 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -122,6 +122,7 @@ int test__pfm(struct test *test, int subtest);
const char *test__pfm_subtest_get_desc(int subtest);
int test__pfm_subtest_get_nr(void);
int test__parse_metric(struct test *test, int subtest);
+int test__perf_time_to_tsc(struct test *test, int subtest);

bool test__bp_signal_is_supported(void);
bool test__bp_account_is_supported(void);
--
2.17.1

2020-09-03 02:27:23

by Leo Yan

Subject: Re: [PATCH v3 2/6] perf tsc: Add rdtsc() for Arm64

Hi Peter,

On Wed, Sep 02, 2020 at 03:48:05PM +0200, Peter Zijlstra wrote:
> On Wed, Sep 02, 2020 at 02:21:27PM +0100, Leo Yan wrote:
> > The system register CNTVCT_EL0 can be used to retrieve the counter from
> > user space. Add rdtsc() for Arm64.
>
> > +u64 rdtsc(void)
> > +{
> > + u64 val;
>
> Would it make sense to put a comment in that this counter is/could-be
> 'short' ? Because unlike x86-TSC, this thing isn't architecturally
> specified to be 64bits wide.

Will add the comment below:

According to ARM DDI 0487F.c, from Armv8.0 to Armv8.5 inclusive, the
system counter is at least 56 bits wide; from Armv8.6, the counter must
be 64 bits wide. So the system counter could be less than 64 bits wide,
and this is indicated by the flag 'cap_user_time_short' being true.

Thanks for reviewing,
Leo

>
> > + asm volatile("mrs %0, cntvct_el0" : "=r" (val));
> > +
> > + return val;
> > +}
> > --
> > 2.17.1
> >

2020-09-04 19:10:12

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v3 2/6] perf tsc: Add rdtsc() for Arm64

Em Thu, Sep 03, 2020 at 10:23:54AM +0800, Leo Yan escreveu:
> Hi Peter,
>
> On Wed, Sep 02, 2020 at 03:48:05PM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 02, 2020 at 02:21:27PM +0100, Leo Yan wrote:
> > > The system register CNTVCT_EL0 can be used to retrieve the counter from
> > > user space. Add rdtsc() for Arm64.
> >
> > > +u64 rdtsc(void)
> > > +{
> > > + u64 val;
> >
> > Would it make sense to put a comment in that this counter is/could-be
> > 'short' ? Because unlike x86-TSC, this thing isn't architecturally
> > specified to be 64bits wide.
>
> Will add the comment below:
>
> According to ARM DDI 0487F.c, from Armv8.0 to Armv8.5 inclusive, the
> system counter is at least 56 bits wide; from Armv8.6, the counter must
> be 64 bits wide. So the system counter could be less than 64 bits wide,
> and this is indicated by the flag 'cap_user_time_short' being true.

Ok, so waiting for v4.

- Arnaldo