This patch series addresses issues that perf has when attempting to show
userspace stacks in the presence of pointer authentication on arm64.
Depending on whether libunwind or libdw is used, perf incorrectly
displays the userspace stack in 'perf report --stdio'. With libunwind,
only the leaf function is shown.
|
---0x200000004005bf
0x200000004005bf
my_leaf_function
With libdw, only the leaf function is shown even though there are
callers in the application.
|
---my_leaf_function
perf cannot show the full stack in 'perf report --stdio' because the
unwinders are given instruction pointers which contain a pointer
authentication code (PAC). For the libraries to unwind correctly, they
need to know which bits of the instruction pointer to clear.
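
As a rough illustration of what "clearing the PAC bits" means, here is a
minimal sketch. strip_pac() is a hypothetical helper, not code from this
series, and the mask value in the usage note is an assumption chosen only
so that it covers the stray high bit in the address shown above; the real
mask comes from the kernel.

```c
#include <stdint.h>

/*
 * Hypothetical helper: clear the PAC bits from a user-space instruction
 * pointer. insn_mask has a 1 in every bit position occupied by the PAC,
 * so AND-ing with its complement leaves only the address bits.
 */
static uint64_t strip_pac(uint64_t addr, uint64_t insn_mask)
{
	return addr & ~insn_mask;
}
```

With an assumed mask of 0x007f000000000000 (bits 48 to 54), the address
0x200000004005bf from the libunwind output becomes 0x4005bf, which is the
kind of value an unwinder can resolve against the binary.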
The kernel exposes the set of PAC bits via the NT_ARM_PAC_MASK regset.
It is expected that this may vary per-task in future. The kernel also
exposes which pointer authentication keys are enabled via the
NT_ARM_PAC_ENABLED_KEYS regset, and this can change dynamically. These
are per-task state which perf would need to sample.
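
For readers unfamiliar with the enabled-keys state, the sketch below
decodes such a bitmask into key names. It assumes the bit positions of
the PR_PAC_AP*KEY constants from <linux/prctl.h> (redefined locally so
the snippet builds on any host); pac_keys_to_str() and the table are
hypothetical names, not part of this series.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed to mirror PR_PAC_APIAKEY..PR_PAC_APDBKEY in <linux/prctl.h>. */
static const struct {
	uint64_t bit;
	const char *name;
} pac_keys[] = {
	{ 1ULL << 0, "APIA" },	/* instruction key A */
	{ 1ULL << 1, "APIB" },	/* instruction key B */
	{ 1ULL << 2, "APDA" },	/* data key A */
	{ 1ULL << 3, "APDB" },	/* data key B */
};

/*
 * Write a comma-separated list of the keys set in 'keys' into buf.
 * Returns 0 on success, -1 if buf is too small.
 */
static int pac_keys_to_str(uint64_t keys, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (i = 0; i < sizeof(pac_keys) / sizeof(pac_keys[0]); i++) {
		int n;

		if (!(keys & pac_keys[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "," : "", pac_keys[i].name);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

For example, a value of 0x5 would decode to "APIA,APDA" under these
assumptions.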
It's not always feasible for perf to acquire these regsets via ptrace.
When sampling system-wide or with inherited events this may require a
large volume of ptrace requests, and by the time the perf tool processes
a sample for a task, that task might already have terminated.
Instead, these patches allow this state to be sampled into the perf
ring buffer, where it can be consumed more easily by the perf tool.
The first patch changes the kernel to send the pointer authentication
masks to userspace perf via the perf ring buffer. They are published in
each sample, using a new sample field, PERF_SAMPLE_ARCH_1.
The subsequent patches change userspace perf to
1) request the PERF_SAMPLE_ARCH_1 sample field
2) supply the instruction mask to libunwind
3) cope with an older kernel that does not know about the
PERF_SAMPLE_ARCH_1 sample field
4) check whether the version of libunwind can accept an instruction
mask from perf and, if so, enable the feature
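
The older-kernel handling in the list above can be pictured roughly as
follows. This is only a sketch of the retry idea, not the evsel.c code
from the series: open_with_fallback(), old_kernel_open() and the macro
name are hypothetical, the bit value is the one implied by the
sample_type values in the test attr files of this series, and it is
assumed that a kernel rejects unknown sample bits with EINVAL.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed bit for PERF_SAMPLE_ARCH_1, inferred from the attr tests. */
#define SKETCH_SAMPLE_ARCH_1	(1ULL << 25)

/* Hypothetical stand-in for opening an event: fd >= 0 or -errno. */
typedef int (*open_fn)(uint64_t sample_type);

/*
 * Try to open with the requested sample_type. If the open fails with
 * -EINVAL while the ptrauth bit is set, clear the bit and retry once,
 * so that an older kernel still yields a usable event.
 */
static int open_with_fallback(open_fn open_event, uint64_t *sample_type)
{
	int fd = open_event(*sample_type);

	if (fd == -EINVAL && (*sample_type & SKETCH_SAMPLE_ARCH_1)) {
		*sample_type &= ~SKETCH_SAMPLE_ARCH_1;
		fd = open_event(*sample_type);
	}
	return fd;
}

/* Fake "older kernel" used to exercise the fallback path. */
static int old_kernel_open(uint64_t sample_type)
{
	return (sample_type & SKETCH_SAMPLE_ARCH_1) ? -EINVAL : 3;
}
```

The caller keeps the updated sample_type, so later sample parsing knows
that no ptrauth data will be present.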
These changes depend on a change to libunwind that has been merged but
is not yet released.
https://github.com/libunwind/libunwind/pull/360
Andrew Kilroy (6):
perf arm64: Send pointer auth masks to ring buffer
perf evsel: Do not request ptrauth sample field if not supported
perf tools: arm64: Read ptrauth data from kernel
perf libunwind: Feature check for libunwind ptrauth callback
perf libunwind: arm64 pointer authentication
perf tools: Print ptrauth struct in perf report
German Gomez (2):
perf test: Update arm64 tests to expect ptrauth masks
perf test arm64: Test unwinding with PACs on gcc & clang compilers
arch/arm64/include/asm/arch_sample_data.h | 38 ++++++
arch/arm64/kernel/Makefile | 2 +-
arch/arm64/kernel/arch_sample_data.c | 37 ++++++
include/linux/perf_event.h | 24 ++++
include/uapi/linux/perf_event.h | 5 +-
kernel/events/core.c | 35 ++++++
tools/build/Makefile.feature | 2 +
tools/build/feature/Makefile | 4 +
tools/build/feature/test-all.c | 5 +
.../feature/test-libunwind-arm64-ptrauth.c | 26 ++++
tools/include/uapi/linux/perf_event.h | 5 +-
tools/perf/Makefile.config | 10 ++
tools/perf/Makefile.perf | 1 +
tools/perf/tests/Build | 1 +
tools/perf/tests/arm_unwind_pac.c | 113 ++++++++++++++++++
tools/perf/tests/arm_unwind_pac.sh | 57 +++++++++
tools/perf/tests/attr/README | 1 +
.../attr/test-record-graph-default-aarch64 | 3 +-
tools/perf/tests/attr/test-record-graph-dwarf | 1 +
.../attr/test-record-graph-dwarf-aarch64 | 13 ++
.../tests/attr/test-record-graph-fp-aarch64 | 3 +-
tools/perf/tests/builtin-test.c | 1 +
tools/perf/tests/sample-parsing.c | 2 +-
tools/perf/tests/tests.h | 1 +
tools/perf/util/event.h | 8 ++
tools/perf/util/evsel.c | 64 ++++++++++
tools/perf/util/evsel.h | 1 +
tools/perf/util/perf_event_attr_fprintf.c | 2 +-
tools/perf/util/session.c | 15 +++
tools/perf/util/unwind-libunwind-local.c | 12 ++
30 files changed, 485 insertions(+), 7 deletions(-)
create mode 100644 arch/arm64/include/asm/arch_sample_data.h
create mode 100644 arch/arm64/kernel/arch_sample_data.c
create mode 100644 tools/build/feature/test-libunwind-arm64-ptrauth.c
create mode 100644 tools/perf/tests/arm_unwind_pac.c
create mode 100755 tools/perf/tests/arm_unwind_pac.sh
create mode 100644 tools/perf/tests/attr/test-record-graph-dwarf-aarch64
--
2.17.1
Print a perf sample's ptrauth struct so that the PAC masks can be seen,
to aid debugging.
Signed-off-by: Andrew Kilroy <[email protected]>
---
tools/perf/util/session.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 37f833c3c81b..6b56e638d4dd 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1329,6 +1329,13 @@ char *get_page_size_name(u64 size, char *str)
return str;
}
+static void ptrauth__printf(struct ptrauth_info *ptrauth)
+{
+ printf(" . ptrauth enabled keys: 0x%016"PRIx64"\n", ptrauth->enabled_keys);
+ printf(" . ptrauth instruction mask: 0x%016"PRIx64"\n", ptrauth->insn_mask);
+ printf(" . ptrauth data mask: 0x%016"PRIx64"\n", ptrauth->data_mask);
+}
+
static void dump_sample(struct evsel *evsel, union perf_event *event,
struct perf_sample *sample, const char *arch)
{
@@ -1385,6 +1392,14 @@ static void dump_sample(struct evsel *evsel, union perf_event *event,
if (sample_type & PERF_SAMPLE_READ)
sample_read__printf(sample, evsel->core.attr.read_format);
+
+ if (sample_type & PERF_SAMPLE_ARCH_1) {
+ const char *normlzd_arch = perf_env__arch(evsel->evlist->env);
+
+ if (normlzd_arch && strcmp(normlzd_arch, "arm64") == 0)
+ ptrauth__printf(&sample->ptrauth);
+ }
+
}
static void dump_read(struct evsel *evsel, union perf_event *event)
--
2.17.1
From: German Gomez <[email protected]>
We will request the pointer auth masks in a followup commit, so take the
opportunity to update the relevant tests.
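
For reference, the alternative sample_type values in the attr files are
the existing values with one extra bit set. The sketch below shows the
arithmetic; the bit position (25) is inferred from those values rather
than quoted from the uapi header, and the names are hypothetical.

```c
#include <stdint.h>

/* Bit inferred from 33558823 - 4391 = 33554432 = 1 << 25. */
#define ATTR_TEST_ARCH_1_BIT	(1ULL << 25)

/* Return the sample_type expected when the ptrauth field is requested. */
static uint64_t with_ptrauth(uint64_t sample_type)
{
	return sample_type | ATTR_TEST_ARCH_1_BIT;
}
```

with_ptrauth(4391) gives 33558823 (the -g and fp cases) and
with_ptrauth(45359) gives 33599791 (the dwarf case).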
Signed-off-by: German Gomez <[email protected]>
Signed-off-by: Andrew Kilroy <[email protected]>
---
tools/perf/tests/attr/README | 1 +
.../tests/attr/test-record-graph-default-aarch64 | 3 ++-
tools/perf/tests/attr/test-record-graph-dwarf | 1 +
.../perf/tests/attr/test-record-graph-dwarf-aarch64 | 13 +++++++++++++
tools/perf/tests/attr/test-record-graph-fp-aarch64 | 3 ++-
5 files changed, 19 insertions(+), 2 deletions(-)
create mode 100644 tools/perf/tests/attr/test-record-graph-dwarf-aarch64
diff --git a/tools/perf/tests/attr/README b/tools/perf/tests/attr/README
index eb3f7d4bb324..9d7f4646920f 100644
--- a/tools/perf/tests/attr/README
+++ b/tools/perf/tests/attr/README
@@ -47,6 +47,7 @@ Following tests are defined (with perf commands):
perf record -g kill (test-record-graph-default)
perf record -g kill (test-record-graph-default-aarch64)
perf record --call-graph dwarf kill (test-record-graph-dwarf)
+ perf record --call-graph dwarf kill (test-record-graph-dwarf-aarch64)
perf record --call-graph fp kill (test-record-graph-fp)
perf record --call-graph fp kill (test-record-graph-fp-aarch64)
perf record --group -e cycles,instructions kill (test-record-group)
diff --git a/tools/perf/tests/attr/test-record-graph-default-aarch64 b/tools/perf/tests/attr/test-record-graph-default-aarch64
index e98d62efb6f7..948d41c162aa 100644
--- a/tools/perf/tests/attr/test-record-graph-default-aarch64
+++ b/tools/perf/tests/attr/test-record-graph-default-aarch64
@@ -5,5 +5,6 @@ ret = 1
arch = aarch64
[event:base-record]
-sample_type=4391
+# handle both with and without ARM64_PTRAUTH
+sample_type=4391|33558823
sample_regs_user=1073741824
diff --git a/tools/perf/tests/attr/test-record-graph-dwarf b/tools/perf/tests/attr/test-record-graph-dwarf
index ae92061d611d..619bccd886c4 100644
--- a/tools/perf/tests/attr/test-record-graph-dwarf
+++ b/tools/perf/tests/attr/test-record-graph-dwarf
@@ -2,6 +2,7 @@
command = record
args = --no-bpf-event --call-graph dwarf -- kill >/dev/null 2>&1
ret = 1
+arch = !aarch64
[event:base-record]
sample_type=45359
diff --git a/tools/perf/tests/attr/test-record-graph-dwarf-aarch64 b/tools/perf/tests/attr/test-record-graph-dwarf-aarch64
new file mode 100644
index 000000000000..daec43b39e2e
--- /dev/null
+++ b/tools/perf/tests/attr/test-record-graph-dwarf-aarch64
@@ -0,0 +1,13 @@
+[config]
+command = record
+args = --no-bpf-event --call-graph dwarf -- kill >/dev/null 2>&1
+ret = 1
+arch = aarch64
+
+[event:base-record]
+# handle both with and without ARM64_PTRAUTH
+sample_type=45359|33599791
+exclude_callchain_user=1
+sample_stack_user=8192
+sample_regs_user=*
+mmap_data=1
diff --git a/tools/perf/tests/attr/test-record-graph-fp-aarch64 b/tools/perf/tests/attr/test-record-graph-fp-aarch64
index cbeea9971285..bc0880f71e8e 100644
--- a/tools/perf/tests/attr/test-record-graph-fp-aarch64
+++ b/tools/perf/tests/attr/test-record-graph-fp-aarch64
@@ -5,5 +5,6 @@ ret = 1
arch = aarch64
[event:base-record]
-sample_type=4391
+# handle both with and without ARM64_PTRAUTH
+sample_type=4391|33558823
sample_regs_user=1073741824
--
2.17.1
On 04/07/2022 15:53, Andrew Kilroy wrote:
> This patch series addresses issues that perf has when attempting to show
> userspace stacks in the presence of pointer authentication on arm64.
>
> Depending on whether libunwind or libdw is used, perf incorrectly
> displays the userspace stack in 'perf report --stdio'. With libunwind,
> only the leaf function is shown.
>
> |
> ---0x200000004005bf
> 0x200000004005bf
> my_leaf_function
>
> With libdw, only the leaf function is shown even though there are
> callers in the application.
>
> |
> ---my_leaf_function
>
>
> The reason perf cannot show the stack upon a perf report --stdio is
> because the unwinders are given instruction pointers which contain a
> pointer authentication code (PAC). For the libraries to correctly
> unwind, they need to know which bits of the instruction pointer to turn
> off.
>
> The kernel exposes the set of PAC bits via the NT_ARM_PAC_MASK regset.
> It is expected that this may vary per-task in future. The kernel also
> exposes which pointer authentication keys are enabled via the
> NT_ARM_PAC_ENABLED_KEYS regset, and this can change dynamically. These
> are per-task state which perf would need to sample.
>
> It's not always feasible for perf to acquire these regsets via ptrace.
> When sampling system-wide or with inherited events this may require a
> large volume of ptrace requests, and by the time the perf tool processes
> a sample for a task, that task might already have terminated.
>
> Instead, these patches allow this state to be sampled into the perf
> ringbuffer, where it can be consumed more easily by the perf tool.
>
> The first patch changes the kernel to send the authentication PAC masks
> to userspace perf via the perf ring buffer. This is published in the
> sample, using a new sample field PERF_SAMPLE_ARCH_1.
>
> The subsequent patches are changes to userspace perf to
>
> 1) request the PERF_SAMPLE_ARCH_1
> 2) supply the instruction mask to libunwind
> 3) ensure perf can cope with an older kernel that does not know about
> the PERF_SAMPLE_ARCH_1 sample field.
> 4) checks if the version of libunwind has the capability to accept
> an instruction mask from perf and if so enable the feature.
>
> These changes depend on a change to libunwind, that is yet to be
> released, although the patch has been merged.
>
> https://github.com/libunwind/libunwind/pull/360
>
For the whole set:
Reviewed-by: James Clark <[email protected]>
I checked that the new test passes on an AWS Graviton 3 instance and
with a build of mainline libunwind. I also checked that the PAC masks on
the samples look sensible.
The tests also still pass when run on N1SDP which doesn't have pointer
authentication.