When SVE registers are pushed onto the stack, the unwinder needs the VG
register because the stack offsets depend on the SVE register width at the
time the sample was taken.
The first two patches add support for sampling the VG register to the kernel and
the docs. The remaining patches add the support to userspace perf.
A small change is also required to libunwind or libdw, depending on which
unwinder is used; these will be published later. Without those changes Perf
continues to work with both libraries, but the VG register is not yet
used for unwinding.
James Clark (6):
perf: arm64: Add SVE vector granule register to user regs
arm64/sve: Add Perf extensions documentation
perf tools: arm64: Copy perf_regs.h from the kernel
perf tools: Use dynamic register set for Dwarf unwind
perf tools: arm64: Decouple Libunwind register names from Perf
perf tools: arm64: Add support for VG register
Documentation/arm64/sve.rst | 20 +++++
arch/arm64/include/uapi/asm/perf_regs.h | 7 +-
arch/arm64/kernel/perf_regs.c | 30 +++++++-
drivers/perf/arm_pmu.c | 2 +-
tools/arch/arm64/include/uapi/asm/perf_regs.h | 7 +-
tools/perf/arch/arm64/util/perf_regs.c | 34 +++++++++
tools/perf/arch/arm64/util/unwind-libunwind.c | 73 +------------------
tools/perf/util/evsel.c | 2 +-
tools/perf/util/perf_regs.c | 2 +
9 files changed, 100 insertions(+), 77 deletions(-)
--
2.28.0
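To make the dependency on VG concrete: VG counts 64-bit granules per vector,
so a Z register occupies VG * 8 bytes and a predicate register VG bytes. The
following is a minimal sketch, not code from the kernel or from any unwinder;
the helper names are made up for illustration.

```c
#include <stdint.h>

/* VG is the SVE vector length in bits divided by 64 (VL in bytes / 8). */
static uint64_t sve_vg_from_vl_bytes(uint64_t vl_bytes)
{
	return (vl_bytes * 8) / 64;
}

/* Bytes taken by nregs Z registers saved on the stack: VG * 8 each. */
static uint64_t sve_z_save_bytes(uint64_t vg, unsigned int nregs)
{
	return vg * 8 * nregs;
}

/* Bytes taken by nregs predicate registers: one bit per vector byte. */
static uint64_t sve_p_save_bytes(uint64_t vg, unsigned int nregs)
{
	return vg * nregs;
}
```

With a 128-bit vector length (VG = 2) four saved Z registers take 64 bytes,
while at 512 bits (VG = 8) the same four registers take 256 bytes, so anything
stored behind them moves unless the unwinder knows VG for that sample.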
DWARF-based unwinding in a function that pushes SVE registers onto
the stack requires the unwinder to know the length of the SVE register
to calculate the stack offsets correctly. This was added to the
Arm-specific DWARF spec as the VG pseudo register[1].
Add the vector length at position 46 if it's requested by userspace and
SVE is supported. If it's not supported then fail to open the event.
The vector length must be on each sample because it can be changed
at runtime via a prctl or ptrace call. Also by adding it as a register
rather than a separate attribute, minimal changes will be required in an
unwinder that already indexes into the register list.
[1]: https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst
Signed-off-by: James Clark <[email protected]>
---
arch/arm64/include/uapi/asm/perf_regs.h | 7 +++++-
arch/arm64/kernel/perf_regs.c | 30 +++++++++++++++++++++++--
drivers/perf/arm_pmu.c | 2 +-
3 files changed, 35 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/uapi/asm/perf_regs.h b/arch/arm64/include/uapi/asm/perf_regs.h
index d54daafa89e3..fd157f46727e 100644
--- a/arch/arm64/include/uapi/asm/perf_regs.h
+++ b/arch/arm64/include/uapi/asm/perf_regs.h
@@ -36,6 +36,11 @@ enum perf_event_arm_regs {
PERF_REG_ARM64_LR,
PERF_REG_ARM64_SP,
PERF_REG_ARM64_PC,
- PERF_REG_ARM64_MAX,
+
+ /* Extended/pseudo registers */
+ PERF_REG_ARM64_VG = 46, // SVE Vector Granule
+
+ PERF_REG_ARM64_MAX = PERF_REG_ARM64_PC + 1,
+ PERF_REG_ARM64_EXTENDED_MAX = PERF_REG_ARM64_VG + 1
};
#endif /* _ASM_ARM64_PERF_REGS_H */
diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c
index f6f58e6265df..b4eece3eb17d 100644
--- a/arch/arm64/kernel/perf_regs.c
+++ b/arch/arm64/kernel/perf_regs.c
@@ -9,9 +9,27 @@
#include <asm/perf_regs.h>
#include <asm/ptrace.h>
+static u64 perf_ext_regs_value(int idx)
+{
+ switch (idx) {
+ case PERF_REG_ARM64_VG:
+ if (WARN_ON_ONCE(!system_supports_sve()))
+ return 0;
+
+ /*
+ * Vector granule is current length in bits of SVE registers
+ * divided by 64.
+ */
+ return (task_get_sve_vl(current) * 8) / 64;
+ default:
+ WARN_ON_ONCE(true);
+ return 0;
+ }
+}
+
u64 perf_reg_value(struct pt_regs *regs, int idx)
{
- if (WARN_ON_ONCE((u32)idx >= PERF_REG_ARM64_MAX))
+ if (WARN_ON_ONCE((u32)idx >= PERF_REG_ARM64_EXTENDED_MAX))
return 0;
/*
@@ -51,6 +69,9 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
if ((u32)idx == PERF_REG_ARM64_PC)
return regs->pc;
+ if ((u32)idx >= PERF_REG_ARM64_MAX)
+ return perf_ext_regs_value(idx);
+
return regs->regs[idx];
}
@@ -58,7 +79,12 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
int perf_reg_validate(u64 mask)
{
- if (!mask || mask & REG_RESERVED)
+ u64 reserved_mask = REG_RESERVED;
+
+ if (system_supports_sve())
+ reserved_mask &= ~(1ULL << PERF_REG_ARM64_VG);
+
+ if (!mask || mask & reserved_mask)
return -EINVAL;
return 0;
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 59d3980b8ca2..3f07df5a7e95 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -894,7 +894,7 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
* pmu::filter_match callback and pmu::event_init group
* validation).
*/
- .capabilities = PERF_PMU_CAP_HETEROGENEOUS_CPUS,
+ .capabilities = PERF_PMU_CAP_HETEROGENEOUS_CPUS | PERF_PMU_CAP_EXTENDED_REGS,
};
pmu->attr_groups[ARMPMU_ATTR_GROUP_COMMON] =
--
2.28.0
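The mask check in perf_reg_validate() above can be tried out in userspace
with the sketch below. It is illustrative only: the function name and the
have_sve parameter are hypothetical, and it assumes that REG_RESERVED covers
every bit at or above PERF_REG_ARM64_MAX, which the diff does not show.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define REG_PC  32            /* PERF_REG_ARM64_PC: X0..X29, LR, SP, PC */
#define REG_MAX (REG_PC + 1)  /* PERF_REG_ARM64_MAX */
#define REG_VG  46            /* PERF_REG_ARM64_VG */

/* Assumption: all bits from REG_MAX upwards are reserved by default. */
#define RESERVED_BITS (~((1ULL << REG_MAX) - 1))

/* Returns 0 if the mask may be sampled, -EINVAL otherwise. */
static int validate_user_reg_mask(uint64_t mask, bool have_sve)
{
	uint64_t reserved = RESERVED_BITS;

	/* VG stops being reserved only when the system has SVE. */
	if (have_sve)
		reserved &= ~(1ULL << REG_VG);

	if (!mask || (mask & reserved))
		return -EINVAL;

	return 0;
}
```

Requesting bit 46 is accepted only when have_sve is true; any other bit
above the base register set is rejected either way.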
Document that the VG register is available in Perf samples.
Signed-off-by: James Clark <[email protected]>
---
Documentation/arm64/sve.rst | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/Documentation/arm64/sve.rst b/Documentation/arm64/sve.rst
index 9d9a4de5bc34..67e65bf66883 100644
--- a/Documentation/arm64/sve.rst
+++ b/Documentation/arm64/sve.rst
@@ -402,6 +402,24 @@ The regset data starts with struct user_sve_header, containing:
* Modifying the system default vector length does not affect the vector length
of any existing process or thread that does not make an execve() call.
+10. Perf extensions
+--------------------------------
+
+* The arm64 specific DWARF standard [5] added the VG (Vector Granule) register
+ at index 46. This register is used for DWARF unwinding when variable length
+ SVE registers are pushed onto the stack.
+
+* Its value is equivalent to the current vector length (VL) in bits divided by
+ 64.
+
+* The value is included in Perf samples in the regs[46] field if
+ PERF_SAMPLE_REGS_USER is set and the sample_regs_user mask has bit 46 set.
+
+* The value is the current value at the time the sample was taken, and it can
+ change over time.
+
+* If the system doesn't support SVE when perf_event_open is called with these
+ settings, the event will fail to open.
Appendix A. SVE programmer's model (informative)
=================================================
@@ -543,3 +561,5 @@ References
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055c/IHI0055C_beta_aapcs64.pdf
http://infocenter.arm.com/help/topic/com.arm.doc.subset.swdev.abi/index.html
Procedure Call Standard for the ARM 64-bit Architecture (AArch64)
+
+[5] https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst
--
2.28.0
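A note for consumers of the raw sample: as far as I understand the
PERF_SAMPLE_REGS_USER layout, the register values are packed in ascending
bit order of sample_regs_user, so VG is found by counting the set bits below
bit 46 rather than by indexing the array with 46 directly. A minimal sketch
of that lookup, with a hypothetical helper name:

```c
#include <stdint.h>

#define REG_VG 46 /* PERF_REG_ARM64_VG */

/*
 * Look up register idx in a packed sample. regs holds one value per set
 * bit of mask, lowest bit first. Returns 0 and stores the value in *val
 * on success, or -1 if the register was not sampled.
 */
static int sample_reg_value(uint64_t mask, const uint64_t *regs, int idx,
			    uint64_t *val)
{
	int pos = 0;

	if (idx < 0 || idx > 63 || !(mask & (1ULL << idx)))
		return -1;

	/* The slot is the number of sampled registers below idx. */
	for (int i = 0; i < idx; i++)
		if (mask & (1ULL << i))
			pos++;

	*val = regs[pos];
	return 0;
}
```

For a mask with only SP, PC and VG set, the VG value is the third element of
the array.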
Get the updated header for the newly added VG register.
Signed-off-by: James Clark <[email protected]>
---
tools/arch/arm64/include/uapi/asm/perf_regs.h | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tools/arch/arm64/include/uapi/asm/perf_regs.h b/tools/arch/arm64/include/uapi/asm/perf_regs.h
index d54daafa89e3..fd157f46727e 100644
--- a/tools/arch/arm64/include/uapi/asm/perf_regs.h
+++ b/tools/arch/arm64/include/uapi/asm/perf_regs.h
@@ -36,6 +36,11 @@ enum perf_event_arm_regs {
PERF_REG_ARM64_LR,
PERF_REG_ARM64_SP,
PERF_REG_ARM64_PC,
- PERF_REG_ARM64_MAX,
+
+ /* Extended/pseudo registers */
+ PERF_REG_ARM64_VG = 46, // SVE Vector Granule
+
+ PERF_REG_ARM64_MAX = PERF_REG_ARM64_PC + 1,
+ PERF_REG_ARM64_EXTENDED_MAX = PERF_REG_ARM64_VG + 1
};
#endif /* _ASM_ARM64_PERF_REGS_H */
--
2.28.0
Architectures can detect availability of extra registers at
runtime so use this more complete set for unwinding. This
will include the VG register on arm64 in a later commit.
If the function isn't implemented then PERF_REGS_MASK is
returned and there is no change.
Signed-off-by: James Clark <[email protected]>
---
tools/perf/util/evsel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d38722560e80..a881784da966 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -875,7 +875,7 @@ static void __evsel__config_callchain(struct evsel *evsel, struct record_opts *o
"specifying a subset with --user-regs may render DWARF unwinding unreliable, "
"so the minimal registers set (IP, SP) is explicitly forced.\n");
} else {
- attr->sample_regs_user |= PERF_REGS_MASK;
+ attr->sample_regs_user |= arch__user_reg_mask();
}
attr->sample_stack_user = param->dump_size;
attr->exclude_callchain_user = 1;
--
2.28.0
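One common way to get the "not implemented means PERF_REGS_MASK" fallback
described above is a weak default definition that an architecture file can
override. The sketch below assumes the GCC/Clang weak attribute and uses a
stand-in value for PERF_REGS_MASK (registers 0 to 32); add_unwind_regs() is
a hypothetical wrapper around the changed line in evsel.c.

```c
#include <stdint.h>

/* Stand-in for the static mask: assumes base registers 0..32. */
#define PERF_REGS_MASK ((1ULL << 33) - 1)

/*
 * Weak default. An arch file can supply a strong definition that probes
 * for extra registers at runtime and returns a larger mask.
 */
__attribute__((weak)) uint64_t arch__user_reg_mask(void)
{
	return PERF_REGS_MASK;
}

/* OR the runtime-detected mask into whatever was already requested. */
static uint64_t add_unwind_regs(uint64_t sample_regs_user)
{
	return sample_regs_user | arch__user_reg_mask();
}
```

When no override is linked in, the result is identical to OR-ing in
PERF_REGS_MASK, which matches the "no change" behaviour in the commit
message.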
Add the name of the VG register so it can be used in --user-regs.
The event will fail to open if the register is requested but not
available, so only add it to the mask if the kernel supports SVE and
also supports that specific register.
Signed-off-by: James Clark <[email protected]>
---
tools/perf/arch/arm64/util/perf_regs.c | 34 ++++++++++++++++++++++++++
tools/perf/util/perf_regs.c | 2 ++
2 files changed, 36 insertions(+)
diff --git a/tools/perf/arch/arm64/util/perf_regs.c b/tools/perf/arch/arm64/util/perf_regs.c
index 476b037eea1c..c0a921512a90 100644
--- a/tools/perf/arch/arm64/util/perf_regs.c
+++ b/tools/perf/arch/arm64/util/perf_regs.c
@@ -2,9 +2,11 @@
#include <errno.h>
#include <regex.h>
#include <string.h>
+#include <sys/auxv.h>
#include <linux/kernel.h>
#include <linux/zalloc.h>
+#include "../../../perf-sys.h"
#include "../../../util/debug.h"
#include "../../../util/event.h"
#include "../../../util/perf_regs.h"
@@ -43,6 +45,7 @@ const struct sample_reg sample_reg_masks[] = {
SMPL_REG(lr, PERF_REG_ARM64_LR),
SMPL_REG(sp, PERF_REG_ARM64_SP),
SMPL_REG(pc, PERF_REG_ARM64_PC),
+ SMPL_REG(vg, PERF_REG_ARM64_VG),
SMPL_REG_END
};
@@ -131,3 +134,34 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
return SDT_ARG_VALID;
}
+
+uint64_t arch__user_reg_mask(void)
+{
+ struct perf_event_attr attr = {
+ .type = PERF_TYPE_HARDWARE,
+ .config = PERF_COUNT_HW_CPU_CYCLES,
+ .sample_type = PERF_SAMPLE_REGS_USER,
+ .disabled = 1,
+ .exclude_kernel = 1,
+ .sample_period = 1,
+ .sample_regs_user = PERF_REGS_MASK
+ };
+ int fd;
+
+ if (getauxval(AT_HWCAP) & HWCAP_SVE)
+ attr.sample_regs_user |= SMPL_REG_MASK(PERF_REG_ARM64_VG);
+
+ /*
+ * Check if the pmu supports perf extended regs, before
+ * returning the register mask to sample.
+ */
+ if (attr.sample_regs_user != PERF_REGS_MASK) {
+ event_attr_init(&attr);
+ fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+ if (fd != -1) {
+ close(fd);
+ return attr.sample_regs_user;
+ }
+ }
+ return PERF_REGS_MASK;
+}
diff --git a/tools/perf/util/perf_regs.c b/tools/perf/util/perf_regs.c
index a982e40ee5a9..872dd3d38782 100644
--- a/tools/perf/util/perf_regs.c
+++ b/tools/perf/util/perf_regs.c
@@ -103,6 +103,8 @@ static const char *__perf_reg_name_arm64(int id)
return "lr";
case PERF_REG_ARM64_PC:
return "pc";
+ case PERF_REG_ARM64_VG:
+ return "vg";
default:
return NULL;
}
--
2.28.0
Dwarf register numbers and real register numbers on aarch64 are
equivalent. Remove the references to the register names from
Libunwind so that new registers are supported without having to
add build time feature checks for each new register.
The unwinder won't ask for a register that it doesn't know about
and Perf will already report an error for an unknown or unrecorded
register in the perf_reg_value() function so extra validation
isn't needed.
After this change the new VG register can be read by libunwind.
Signed-off-by: James Clark <[email protected]>
---
tools/perf/arch/arm64/util/unwind-libunwind.c | 73 +------------------
1 file changed, 2 insertions(+), 71 deletions(-)
diff --git a/tools/perf/arch/arm64/util/unwind-libunwind.c b/tools/perf/arch/arm64/util/unwind-libunwind.c
index 5aecf88e3de6..871af5992298 100644
--- a/tools/perf/arch/arm64/util/unwind-libunwind.c
+++ b/tools/perf/arch/arm64/util/unwind-libunwind.c
@@ -10,77 +10,8 @@
int LIBUNWIND__ARCH_REG_ID(int regnum)
{
- switch (regnum) {
- case UNW_AARCH64_X0:
- return PERF_REG_ARM64_X0;
- case UNW_AARCH64_X1:
- return PERF_REG_ARM64_X1;
- case UNW_AARCH64_X2:
- return PERF_REG_ARM64_X2;
- case UNW_AARCH64_X3:
- return PERF_REG_ARM64_X3;
- case UNW_AARCH64_X4:
- return PERF_REG_ARM64_X4;
- case UNW_AARCH64_X5:
- return PERF_REG_ARM64_X5;
- case UNW_AARCH64_X6:
- return PERF_REG_ARM64_X6;
- case UNW_AARCH64_X7:
- return PERF_REG_ARM64_X7;
- case UNW_AARCH64_X8:
- return PERF_REG_ARM64_X8;
- case UNW_AARCH64_X9:
- return PERF_REG_ARM64_X9;
- case UNW_AARCH64_X10:
- return PERF_REG_ARM64_X10;
- case UNW_AARCH64_X11:
- return PERF_REG_ARM64_X11;
- case UNW_AARCH64_X12:
- return PERF_REG_ARM64_X12;
- case UNW_AARCH64_X13:
- return PERF_REG_ARM64_X13;
- case UNW_AARCH64_X14:
- return PERF_REG_ARM64_X14;
- case UNW_AARCH64_X15:
- return PERF_REG_ARM64_X15;
- case UNW_AARCH64_X16:
- return PERF_REG_ARM64_X16;
- case UNW_AARCH64_X17:
- return PERF_REG_ARM64_X17;
- case UNW_AARCH64_X18:
- return PERF_REG_ARM64_X18;
- case UNW_AARCH64_X19:
- return PERF_REG_ARM64_X19;
- case UNW_AARCH64_X20:
- return PERF_REG_ARM64_X20;
- case UNW_AARCH64_X21:
- return PERF_REG_ARM64_X21;
- case UNW_AARCH64_X22:
- return PERF_REG_ARM64_X22;
- case UNW_AARCH64_X23:
- return PERF_REG_ARM64_X23;
- case UNW_AARCH64_X24:
- return PERF_REG_ARM64_X24;
- case UNW_AARCH64_X25:
- return PERF_REG_ARM64_X25;
- case UNW_AARCH64_X26:
- return PERF_REG_ARM64_X26;
- case UNW_AARCH64_X27:
- return PERF_REG_ARM64_X27;
- case UNW_AARCH64_X28:
- return PERF_REG_ARM64_X28;
- case UNW_AARCH64_X29:
- return PERF_REG_ARM64_X29;
- case UNW_AARCH64_X30:
- return PERF_REG_ARM64_LR;
- case UNW_AARCH64_SP:
- return PERF_REG_ARM64_SP;
- case UNW_AARCH64_PC:
- return PERF_REG_ARM64_PC;
- default:
- pr_err("unwind: invalid reg id %d\n", regnum);
+ if (regnum < 0 || regnum >= PERF_REG_ARM64_EXTENDED_MAX)
return -EINVAL;
- }
- return -EINVAL;
+ return regnum;
}
--
2.28.0
On Mon, May 09, 2022 at 03:42:50PM +0100, James Clark wrote:
> +* Its value is equivalent to the current vector length (VL) in bits divided by
> + 64.
Please explicitly say that this is the current *SVE* vector length.
With SME, entering streaming mode means we have SVE registers with the
current streaming vector length, which may be different to the SVE
vector length, so someone may read the above as referring to the vector
length that applies to the current Z/P registers.
On Mon, May 09, 2022 at 03:42:49PM +0100, James Clark wrote:
> Dwarf based unwinding in a function that pushes SVE registers onto
> the stack requires the unwinder to know the length of the SVE register
> to calculate the stack offsets correctly. This was added to the Arm
> specific Dwarf spec as the VG pseudo register[1].
>
> Add the vector length at position 46 if it's requested by userspace and
> SVE is supported. If it's not supported then fail to open the event.
>
> The vector length must be on each sample because it can be changed
> at runtime via a prctl or ptrace call. Also by adding it as a register
> rather than a separate attribute, minimal changes will be required in an
> unwinder that already indexes into the register list.
> +static u64 perf_ext_regs_value(int idx)
> +{
> + switch (idx) {
> + case PERF_REG_ARM64_VG:
> + if (WARN_ON_ONCE(!system_supports_sve()))
> + return 0;
These WARN_ON_ONCE()s seem a bit loud but I do see they are idiomatic
for this code so
Reviewed-by: Mark Brown <[email protected]>
On 09/05/2022 16:48, Mark Brown wrote:
> On Mon, May 09, 2022 at 03:42:49PM +0100, James Clark wrote:
>> Dwarf based unwinding in a function that pushes SVE registers onto
>> the stack requires the unwinder to know the length of the SVE register
>> to calculate the stack offsets correctly. This was added to the Arm
>> specific Dwarf spec as the VG pseudo register[1].
>>
>> Add the vector length at position 46 if it's requested by userspace and
>> SVE is supported. If it's not supported then fail to open the event.
>>
>> The vector length must be on each sample because it can be changed
>> at runtime via a prctl or ptrace call. Also by adding it as a register
>> rather than a separate attribute, minimal changes will be required in an
>> unwinder that already indexes into the register list.
>
>> +static u64 perf_ext_regs_value(int idx)
>> +{
>> + switch (idx) {
>> + case PERF_REG_ARM64_VG:
>> + if (WARN_ON_ONCE(!system_supports_sve()))
>> + return 0;
>
> These WARN_ON_ONCE()s seem a bit loud but I do see they are idiomatic
> for this code so
They should never ever be hit because the mask is validated when opening
the event so hopefully it's not an issue.
>
> Reviewed-by: Mark Brown <[email protected]>
Thanks Mark
On Mon, May 09, 2022 at 03:42:52PM +0100, James Clark wrote:
> Architectures can detect availability of extra registers at
> runtime so use this more complete set for unwinding. This
> will include the VG register on arm64 in a later commit.
>
> If the function isn't implemented then PERF_REGS_MASK is
> returned and there is no change.
Committer notes:
Added util/perf_regs.c to tools/perf/util/python-ext-sources so that
'perf test python' passes, i.e. the perf python binding has all the
symbols it needs, addressing:
$ perf test -v python
19: 'import perf' in python :
--- start ---
test child forked, pid 2037817
python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' "
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.cpython-310-x86_64-linux-gnu.so: undefined symbol: arch__user_reg_mask
test child finished with -1
---- end ----
'import perf' in python: FAILED!
$
> Signed-off-by: James Clark <[email protected]>
> ---
> tools/perf/util/evsel.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index d38722560e80..a881784da966 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -875,7 +875,7 @@ static void __evsel__config_callchain(struct evsel *evsel, struct record_opts *o
> "specifying a subset with --user-regs may render DWARF unwinding unreliable, "
> "so the minimal registers set (IP, SP) is explicitly forced.\n");
> } else {
> - attr->sample_regs_user |= PERF_REGS_MASK;
> + attr->sample_regs_user |= arch__user_reg_mask();
> }
> attr->sample_stack_user = param->dump_size;
> attr->exclude_callchain_user = 1;
> --
> 2.28.0
--
- Arnaldo
On Thu, May 26, 2022 at 12:44:33PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, May 09, 2022 at 03:42:52PM +0100, James Clark wrote:
> > Architectures can detect availability of extra registers at
> > runtime so use this more complete set for unwinding. This
> > will include the VG register on arm64 in a later commit.
> >
> > If the function isn't implemented then PERF_REGS_MASK is
> > returned and there is no change.
>
> Committer notes:
>
> Added util/perf_regs.c to tools/perf/util/python-ext-sources so that
> 'perf test python' passes, i.e. the perf python binding has all the
> symbols it needs, addressing:
>
> $ perf test -v python
> 19: 'import perf' in python :
> --- start ---
> test child forked, pid 2037817
> python usage test: "echo "import sys ; sys.path.append('/tmp/build/perf/python'); import perf" | '/usr/bin/python3' "
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> ImportError: /tmp/build/perf/python/perf.cpython-310-x86_64-linux-gnu.so: undefined symbol: arch__user_reg_mask
> test child finished with -1
> ---- end ----
> 'import perf' in python: FAILED!
> $
Too old to support?
69 7.19 ubuntu:16.04-x-arm64 : FAIL gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)
arch/arm64/util/perf_regs.c: In function 'arch__user_reg_mask':
arch/arm64/util/perf_regs.c:151:28: error: 'HWCAP_SVE' undeclared (first use in this function)
if (getauxval(AT_HWCAP) & HWCAP_SVE)
^
arch/arm64/util/perf_regs.c:151:28: note: each undeclared identifier is reported only once for each function it appears in
/git/perf-5.18.0/tools/build/Makefile.build:139: recipe for target 'util' failed
make[5]: *** [util] Error 2
/git/perf-5.18.0/tools/build/Makefile.build:139: recipe for target 'arm64' failed
make[4]: *** [arm64] Error 2
/git/perf-5.18.0/tools/build/Makefile.build:139: recipe for target 'arch' failed
make[3]: *** [arch] Error 2
⬢[acme@toolbox perf]$ find . -name "*.h" | xargs grep -w HWCAP_SVE
./arch/arm64/include/uapi/asm/hwcap.h:#define HWCAP_SVE (1 << 22)
⬢[acme@toolbox perf]$
> > Signed-off-by: James Clark <[email protected]>
> > ---
> > tools/perf/util/evsel.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index d38722560e80..a881784da966 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -875,7 +875,7 @@ static void __evsel__config_callchain(struct evsel *evsel, struct record_opts *o
> > "specifying a subset with --user-regs may render DWARF unwinding unreliable, "
> > "so the minimal registers set (IP, SP) is explicitly forced.\n");
> > } else {
> > - attr->sample_regs_user |= PERF_REGS_MASK;
> > + attr->sample_regs_user |= arch__user_reg_mask();
> > }
> > attr->sample_stack_user = param->dump_size;
> > attr->exclude_callchain_user = 1;
> > --
> > 2.28.0
>
> --
>
> - Arnaldo
--
- Arnaldo
On Thu, May 26, 2022 at 03:19:54PM -0300, Arnaldo Carvalho de Melo wrote:
[...]
> Too old to support?
>
> 69 7.19 ubuntu:16.04-x-arm64 : FAIL gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)
> arch/arm64/util/perf_regs.c: In function 'arch__user_reg_mask':
> arch/arm64/util/perf_regs.c:151:28: error: 'HWCAP_SVE' undeclared (first use in this function)
> if (getauxval(AT_HWCAP) & HWCAP_SVE)
> ^
> arch/arm64/util/perf_regs.c:151:28: note: each undeclared identifier is reported only once for each function it appears in
> /git/perf-5.18.0/tools/build/Makefile.build:139: recipe for target 'util' failed
> make[5]: *** [util] Error 2
> /git/perf-5.18.0/tools/build/Makefile.build:139: recipe for target 'arm64' failed
> make[4]: *** [arm64] Error 2
> /git/perf-5.18.0/tools/build/Makefile.build:139: recipe for target 'arch' failed
> make[3]: *** [arch] Error 2
>
>
> ⬢[acme@toolbox perf]$ find . -name "*.h" | xargs grep -w HWCAP_SVE
> ./arch/arm64/include/uapi/asm/hwcap.h:#define HWCAP_SVE (1 << 22)
> ⬢[acme@toolbox perf]$
I tested aarch64 GCC-7.4.1 which doesn't support HWCAP_SVE, but
aarch64 GCC-8.3.0 and GCC-9.4.0 support it.
Either we can add below code:
#ifndef HWCAP_SVE
#define HWCAP_SVE (1 << 22)
#endif
Or directly include header file <.../asm/hwcap.h>.
Not sure which method is preferred. Maybe the first approach keeps this
decoupled from the Linux kernel code?
Thanks,
Leo
On Fri, May 27, 2022 at 02:18:54PM +0800, Leo Yan wrote:
> On Thu, May 26, 2022 at 03:19:54PM -0300, Arnaldo Carvalho de Melo wrote:
>
> [...]
>
> > Too old to support?
> >
> > 69 7.19 ubuntu:16.04-x-arm64 : FAIL gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)
> > arch/arm64/util/perf_regs.c: In function 'arch__user_reg_mask':
> > arch/arm64/util/perf_regs.c:151:28: error: 'HWCAP_SVE' undeclared (first use in this function)
> > if (getauxval(AT_HWCAP) & HWCAP_SVE)
> > ^
> > arch/arm64/util/perf_regs.c:151:28: note: each undeclared identifier is reported only once for each function it appears in
> > /git/perf-5.18.0/tools/build/Makefile.build:139: recipe for target 'util' failed
> > make[5]: *** [util] Error 2
> > /git/perf-5.18.0/tools/build/Makefile.build:139: recipe for target 'arm64' failed
> > make[4]: *** [arm64] Error 2
> > /git/perf-5.18.0/tools/build/Makefile.build:139: recipe for target 'arch' failed
> > make[3]: *** [arch] Error 2
> >
> >
> > ⬢[acme@toolbox perf]$ find . -name "*.h" | xargs grep -w HWCAP_SVE
> > ./arch/arm64/include/uapi/asm/hwcap.h:#define HWCAP_SVE (1 << 22)
> > ⬢[acme@toolbox perf]$
>
> I tested aarch64 GCC-7.4.1 which doesn't support HWCAP_SVE, but
> aarch64 GCC-8.3.0 and GCC-9.4.0 support it.
>
> Either we can add below code:
>
> #ifndef HWCAP_SVE
> #define HWCAP_SVE (1 << 22)
> #endif
>
> Or directly include header file <.../asm/hwcap.h>.
>
> Not sure which method is preferred. Maybe the first approach keeps this
> decoupled from the Linux kernel code?
Let's go KISS and just define it if not present, as you suggested above;
will test now.
- Arnaldo
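For reference, the "define it if not present" approach could look roughly
like the sketch below. The two helper names are hypothetical, and the bit
position is the one from arch/arm64/include/uapi/asm/hwcap.h quoted above;
on a non-arm64 host bit 22 of AT_HWCAP means something else, so
cpu_has_sve() is only meaningful when built for and run on arm64.

```c
#include <stdbool.h>
#include <sys/auxv.h>

/* Older toolchain headers may not provide this; the value is the arm64 one. */
#ifndef HWCAP_SVE
#define HWCAP_SVE (1 << 22)
#endif

/* Pure helper so the bit test can be checked without SVE hardware. */
static bool hwcap_has_sve(unsigned long hwcap)
{
	return (hwcap & HWCAP_SVE) != 0;
}

/* Reads the real auxiliary vector of the running process. */
static bool cpu_has_sve(void)
{
	return hwcap_has_sve(getauxval(AT_HWCAP));
}
```

Keeping the fallback define local avoids depending on the kernel's own
header being present in the toolchain's sysroot.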