2023-07-25 21:41:52

by Will Deacon

Subject: [PATCH v2 0/2] Fix 'faddr2line' for LLVM arm64 builds

Hi folks,

This is version two of the patches I sent yesterday attempting to fix
'faddr2line' for LLVM arm64 kernel images.

v1: https://lore.kernel.org/r/[email protected]

Changes since v1 include:
  * Dropped the patch adding support for LLVM=1, since Josh said he'd
    pick it up.
  * Reused the ignored symbol regex from 'mksysmap' instead of ignoring
    symbols based on their type.

Feedback welcome. I've checked that the symbols in System.map for a
defconfig arm64 build are the same with and without these changes, but I
think I'd still like this to spend time in -next if we go down this
route.

Cheers,

Will

Cc: Masahiro Yamada <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: John Stultz <[email protected]>
Cc: [email protected]

--->8

Will Deacon (2):
  scripts/mksysmap: Factor out sed ignored symbols expression into
    script
  scripts/faddr2line: Constrain readelf output to symbols from
    System.map

 scripts/faddr2line              |  3 +-
 scripts/mksysmap                | 77 +--------------------------------
 scripts/sysmap-ignored-syms.sed | 74 +++++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+), 77 deletions(-)
 create mode 100644 scripts/sysmap-ignored-syms.sed

--
2.41.0.487.g6d72f3e995-goog



2023-07-25 21:42:01

by Will Deacon

Subject: [PATCH v2 2/2] scripts/faddr2line: Constrain readelf output to symbols from System.map

Some symbols emitted in the readelf output but filtered from System.map
can confuse the 'faddr2line' symbol size calculation, resulting in the
erroneous rejection of valid offsets. This is especially prevalent when
building an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions
are prefixed with a 32-bit data value in a '$d.n' section. For example:

  447538: ffff800080014b80   548 FUNC    GLOBAL DEFAULT    2 do_one_initcall
     104: ffff800080014c74     0 NOTYPE  LOCAL  DEFAULT    2 $x.73
     106: ffff800080014d30     0 NOTYPE  LOCAL  DEFAULT    2 $x.75
     111: ffff800080014da4     0 NOTYPE  LOCAL  DEFAULT    2 $d.78
     112: ffff800080014da8     0 NOTYPE  LOCAL  DEFAULT    2 $x.79
      36: ffff800080014de0   200 FUNC    LOCAL  DEFAULT    2 run_init_process

Adding a warning to do_one_initcall() results in:

| WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260

Which 'faddr2line' refuses to accept:

  $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
  skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
  no match for do_one_initcall+0xf4/0x260

Filter out entries from readelf using the 'sysmap-ignored-syms.sed'
script used to construct System.map, so that the size of a symbol is
calculated as a delta to the next symbol present in ksymtab.
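
As a rough illustration of the arithmetic behind that rejection, the two
values in the "size mismatch" message can be reproduced from the
addresses in the readelf excerpt. This is only a sketch and not the
actual 'faddr2line' code: the variable names and the delta() helper are
made up for the example, and only the lower 32 bits of each address are
used, since the upper 32 bits (0xffff8000) are identical for all three
symbols.

```shell
# Lower 32 bits of the addresses from the readelf excerpt.
sym_addr=0x80014b80      # do_one_initcall
next_mapping=0x80014da4  # $d.78, a symbol filtered from System.map
next_sysmap=0x80014de0   # run_init_process, present in System.map

# Hypothetical helper: symbol size as the delta to a following symbol.
delta() { printf '0x%x\n' $(( $2 - $1 )); }

size_unfiltered=$(delta "$sym_addr" "$next_mapping")
size_filtered=$(delta "$sym_addr" "$next_sysmap")

echo "delta to \$d.78:           $size_unfiltered"
echo "delta to run_init_process: $size_filtered"
```

The first delta equals the ELF size of 548 bytes, while the second is
the size reported in the kernel log.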

Cc: Josh Poimboeuf <[email protected]>
Cc: John Stultz <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
scripts/faddr2line | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/faddr2line b/scripts/faddr2line
index 62a3fa6f6f59..da734af90036 100755
--- a/scripts/faddr2line
+++ b/scripts/faddr2line
@@ -64,6 +64,7 @@ else
 	UTIL_PREFIX=${CROSS_COMPILE:-}
 fi

+IGNORED_SYMS=$(dirname $0)/sysmap-ignored-syms.sed
 READELF="${UTIL_PREFIX}readelf"
 ADDR2LINE="${UTIL_PREFIX}addr2line"
 AWK="awk"
@@ -185,7 +186,7 @@ __faddr2line() {
 				found=2
 				break
 			fi
-		done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
+		done < <(${READELF} --symbols --wide $objfile | sed -f ${IGNORED_SYMS} -e 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)

 		if [[ $found = 0 ]]; then
 			warn "can't find symbol: sym_name: $sym_name sym_sec: $sym_sec sym_addr: $sym_addr sym_elf_size: $sym_elf_size"
--
2.41.0.487.g6d72f3e995-goog


2023-07-25 21:49:47

by Will Deacon

Subject: [PATCH v2 1/2] scripts/mksysmap: Factor out sed ignored symbols expression into script

To prepare for 'faddr2line' reusing the same ignored symbols list as
'mksysmap', factor out the relevant sed expression into its own script,
removing the double-escapes for '$' symbols as they are no longer
required.
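
To illustrate why the extra backslash can go, here is a small sketch
comparing the two forms. The sample input and the temporary script file
are made up for the example and are not taken from a real build.

```shell
# Hypothetical 'nm -n' style sample input.
sample='ffff800080014b80 T do_one_initcall
ffff800080014da4 t $d.78
ffff800080014de0 t run_init_process'

# Inline form: the expression sits inside a double-quoted shell string,
# so the shell reduces '\\$' to '\$' before sed ever sees it.
inline=$(printf '%s\n' "$sample" | sed -e "/ \\$/d")

# Script-file form: sed reads the file directly, with no shell quoting
# in between, so a single backslash is enough.
script=$(mktemp)
printf '%s\n' '/ \$/d' > "$script"
from_file=$(printf '%s\n' "$sample" | sed -f "$script")
rm -f "$script"

echo "$from_file"
```

Both forms hand sed the same expression, so both drop the '$d.78' line
and keep the two function symbols.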

Cc: Masahiro Yamada <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: John Stultz <[email protected]>
Cc: [email protected]
Signed-off-by: Will Deacon <[email protected]>
---
scripts/mksysmap | 77 +--------------------------------
scripts/sysmap-ignored-syms.sed | 74 +++++++++++++++++++++++++++++++
2 files changed, 75 insertions(+), 76 deletions(-)
create mode 100644 scripts/sysmap-ignored-syms.sed

diff --git a/scripts/mksysmap b/scripts/mksysmap
index 9ba1c9da0a40..a98b34363258 100755
--- a/scripts/mksysmap
+++ b/scripts/mksysmap
@@ -16,7 +16,7 @@
 # 'W' or 'w'.
 #

-${NM} -n ${1} | sed >${2} -e "
+${NM} -n ${1} | sed >${2} -f $(dirname $0)/sysmap-ignored-syms.sed -e "
 # ---------------------------------------------------------------------------
 # Ignored symbol types
 #
@@ -27,81 +27,6 @@ ${NM} -n ${1} | sed >${2} -e "
 # w: local weak symbols
 / [aNUw] /d

-# ---------------------------------------------------------------------------
-# Ignored prefixes
-# (do not forget a space before each pattern)
-
-# local symbols for ARM, MIPS, etc.
-/ \\$/d
-
-# local labels, .LBB, .Ltmpxxx, .L__unnamed_xx, .LASANPC, etc.
-/ \.L/d
-
-# arm64 EFI stub namespace
-/ __efistub_/d
-
-# arm64 local symbols in PIE namespace
-/ __pi_\\$/d
-/ __pi_\.L/d
-
-# arm64 local symbols in non-VHE KVM namespace
-/ __kvm_nvhe_\\$/d
-/ __kvm_nvhe_\.L/d
-
-# arm64 lld
-/ __AArch64ADRPThunk_/d
-
-# arm lld
-/ __ARMV5PILongThunk_/d
-/ __ARMV7PILongThunk_/d
-/ __ThumbV7PILongThunk_/d
-
-# mips lld
-/ __LA25Thunk_/d
-/ __microLA25Thunk_/d
-
-# CFI type identifiers
-/ __kcfi_typeid_/d
-/ __kvm_nvhe___kcfi_typeid_/d
-/ __pi___kcfi_typeid_/d
-
-# CRC from modversions
-/ __crc_/d
-
-# EXPORT_SYMBOL (symbol name)
-/ __kstrtab_/d
-
-# EXPORT_SYMBOL (namespace)
-/ __kstrtabns_/d
-
-# ---------------------------------------------------------------------------
-# Ignored suffixes
-# (do not forget '$' after each pattern)
-
-# arm
-/_from_arm$/d
-/_from_thumb$/d
-/_veneer$/d
-
-# ---------------------------------------------------------------------------
-# Ignored symbols (exact match)
-# (do not forget a space before and '$' after each pattern)
-
-# for LoongArch?
-/ L0$/d
-
-# ppc
-/ _SDA_BASE_$/d
-/ _SDA2_BASE_$/d
-
-# ---------------------------------------------------------------------------
-# Ignored patterns
-# (symbols that contain the pattern are ignored)
-
-# ppc stub
-/\.long_branch\./d
-/\.plt_branch\./d
-
 # ---------------------------------------------------------------------------
 # Ignored kallsyms symbols
 #
diff --git a/scripts/sysmap-ignored-syms.sed b/scripts/sysmap-ignored-syms.sed
new file mode 100644
index 000000000000..14b9eb2c9ed9
--- /dev/null
+++ b/scripts/sysmap-ignored-syms.sed
@@ -0,0 +1,74 @@
+# ---------------------------------------------------------------------------
+# Ignored prefixes
+# (do not forget a space before each pattern)
+
+# local symbols for ARM, MIPS, etc.
+/ \$/d
+
+# local labels, .LBB, .Ltmpxxx, .L__unnamed_xx, .LASANPC, etc.
+/ \.L/d
+
+# arm64 EFI stub namespace
+/ __efistub_/d
+
+# arm64 local symbols in PIE namespace
+/ __pi_\$/d
+/ __pi_\.L/d
+
+# arm64 local symbols in non-VHE KVM namespace
+/ __kvm_nvhe_\$/d
+/ __kvm_nvhe_\.L/d
+
+# arm64 lld
+/ __AArch64ADRPThunk_/d
+
+# arm lld
+/ __ARMV5PILongThunk_/d
+/ __ARMV7PILongThunk_/d
+/ __ThumbV7PILongThunk_/d
+
+# mips lld
+/ __LA25Thunk_/d
+/ __microLA25Thunk_/d
+
+# CFI type identifiers
+/ __kcfi_typeid_/d
+/ __kvm_nvhe___kcfi_typeid_/d
+/ __pi___kcfi_typeid_/d
+
+# CRC from modversions
+/ __crc_/d
+
+# EXPORT_SYMBOL (symbol name)
+/ __kstrtab_/d
+
+# EXPORT_SYMBOL (namespace)
+/ __kstrtabns_/d
+
+# ---------------------------------------------------------------------------
+# Ignored suffixes
+# (do not forget '$' after each pattern)
+
+# arm
+/_from_arm$/d
+/_from_thumb$/d
+/_veneer$/d
+
+# ---------------------------------------------------------------------------
+# Ignored symbols (exact match)
+# (do not forget a space before and '$' after each pattern)
+
+# for LoongArch?
+/ L0$/d
+
+# ppc
+/ _SDA_BASE_$/d
+/ _SDA2_BASE_$/d
+
+# ---------------------------------------------------------------------------
+# Ignored patterns
+# (symbols that contain the pattern are ignored)
+
+# ppc stub
+/\.long_branch\./d
+/\.plt_branch\./d
--
2.41.0.487.g6d72f3e995-goog


2023-07-25 22:03:35

by Josh Poimboeuf

Subject: Re: [PATCH v2 2/2] scripts/faddr2line: Constrain readelf output to symbols from System.map

On Tue, Jul 25, 2023 at 10:11:57PM +0100, Will Deacon wrote:
> @@ -185,7 +186,7 @@ __faddr2line() {
> found=2
> break
> fi
> - done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
> + done < <(${READELF} --symbols --wide $objfile | sed -f ${IGNORED_SYMS} -e 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
>
> if [[ $found = 0 ]]; then
> warn "can't find symbol: sym_name: $sym_name sym_sec: $sym_sec sym_addr: $sym_addr sym_elf_size: $sym_elf_size"

Looks good, though the outer loop has another readelf incantation:

done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v fn=$sym_name '$4 == "FUNC" && $8 == fn')

It should probably have the same sed options? Also it looks like it's
wrongly checking for FUNC.

--
Josh

2023-07-27 13:14:12

by Will Deacon

Subject: Re: [PATCH v2 2/2] scripts/faddr2line: Constrain readelf output to symbols from System.map

On Tue, Jul 25, 2023 at 02:38:05PM -0700, Josh Poimboeuf wrote:
> On Tue, Jul 25, 2023 at 10:11:57PM +0100, Will Deacon wrote:
> > @@ -185,7 +186,7 @@ __faddr2line() {
> > found=2
> > break
> > fi
> > - done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
> > + done < <(${READELF} --symbols --wide $objfile | sed -f ${IGNORED_SYMS} -e 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
> >
> > if [[ $found = 0 ]]; then
> > warn "can't find symbol: sym_name: $sym_name sym_sec: $sym_sec sym_addr: $sym_addr sym_elf_size: $sym_elf_size"
>
> Looks good, though the outer loop has another readelf incantation:
>
> done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v fn=$sym_name '$4 == "FUNC" && $8 == fn')
>
> It should probably have the same sed options?

Hmm, I don't think it's needed there, is it? The awk expression has a
strict match on $sym_name, which is going to be something extracted from
a kernel log and therefore exists in kallsyms.
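
As a quick sketch of that strict match (using some of the readelf lines
from the commit message as sample input and leaving the FUNC check out;
this is not the script's actual loop, and the variable names here are
only for illustration):

```shell
# Sample input: readelf lines quoted in the patch description.
readelf_out='447538: ffff800080014b80 548 FUNC GLOBAL DEFAULT 2 do_one_initcall
104: ffff800080014c74 0 NOTYPE LOCAL DEFAULT 2 $x.73
111: ffff800080014da4 0 NOTYPE LOCAL DEFAULT 2 $d.78
36: ffff800080014de0 200 FUNC LOCAL DEFAULT 2 run_init_process'

sym_name=do_one_initcall

# Exact comparison on the name column ($8): entries such as '$x.73' or
# '$d.78' can never equal a function name taken from a kernel log.
matches=$(printf '%s\n' "$readelf_out" | awk -v fn="$sym_name" '$8 == fn')

echo "$matches"
```

Only the do_one_initcall entry survives the filter, with or without the
ignored-symbols script in front of it.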

> Also it looks like it's wrongly checking for FUNC.

Yes, I agree that should be dropped for the reasons you gave before.

So I can spin a v3, with an extra patch to avoid checking against FUNC.

Will

2023-07-27 17:04:06

by Josh Poimboeuf

Subject: Re: [PATCH v2 2/2] scripts/faddr2line: Constrain readelf output to symbols from System.map

On Thu, Jul 27, 2023 at 01:18:52PM +0100, Will Deacon wrote:
> On Tue, Jul 25, 2023 at 02:38:05PM -0700, Josh Poimboeuf wrote:
> > On Tue, Jul 25, 2023 at 10:11:57PM +0100, Will Deacon wrote:
> > > @@ -185,7 +186,7 @@ __faddr2line() {
> > > found=2
> > > break
> > > fi
> > > - done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
> > > + done < <(${READELF} --symbols --wide $objfile | sed -f ${IGNORED_SYMS} -e 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
> > >
> > > if [[ $found = 0 ]]; then
> > > warn "can't find symbol: sym_name: $sym_name sym_sec: $sym_sec sym_addr: $sym_addr sym_elf_size: $sym_elf_size"
> >
> > Looks good, though the outer loop has another readelf incantation:
> >
> > done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v fn=$sym_name '$4 == "FUNC" && $8 == fn')
> >
> > It should probably have the same sed options?
>
> Hmm, I don't think it's needed there, is it? The awk expression has a
> strict match on $sym_name, which is going to be something extracted from
> a kernel log and therefore exists in kallsyms.

Yes, I think you're right.

> > Also it looks like it's wrongly checking for FUNC.
>
> Yes, I agree that should be dropped for the reasons you gave before.
>
> So I can spin a v3, with an extra patch to avoid checking against FUNC.

Sounds good, thanks!

--
Josh