2023-07-24 18:34:25

by Will Deacon

[permalink] [raw]
Subject: [PATCH 0/2] Fix 'faddr2line' for LLVM arm64 builds

Hi Josh,

We recently noticed that 'faddr2line' isn't working very well on arm64
Android kernels built with LLVM, so I've hacked up this pair of fixes
which get it back into action. Please take a look!

I suck at shell, so apologies in advance for the patches.

Cheers,

Will

Cc: Josh Poimboeuf <[email protected]>
Cc: John Stultz <[email protected]>

--->8

Will Deacon (2):
scripts/faddr2line: Ignore non-function symbols in readelf output
scripts/faddr2line: Use LLVM addr2line and readelf if LLVM=1

scripts/faddr2line | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

--
2.41.0.487.g6d72f3e995-goog



2023-07-24 18:35:05

by Will Deacon

[permalink] [raw]
Subject: [PATCH 1/2] scripts/faddr2line: Ignore non-function symbols in readelf output

Non-function symbols emitted in the readelf output can confuse the
'faddr2line' symbol size calculation, resulting in the erroneous
rejection of valid offsets. This is especially prevalent when building
an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are
prefixed with a 32-bit data value in a '$d.n' section. For example:

447538: ffff800080014b80 548 FUNC GLOBAL DEFAULT 2 do_one_initcall
104: ffff800080014c74 0 NOTYPE LOCAL DEFAULT 2 $x.73
106: ffff800080014d30 0 NOTYPE LOCAL DEFAULT 2 $x.75
111: ffff800080014da4 0 NOTYPE LOCAL DEFAULT 2 $d.78
112: ffff800080014da8 0 NOTYPE LOCAL DEFAULT 2 $x.79
36: ffff800080014de0 200 FUNC LOCAL DEFAULT 2 run_init_process

Adding a warning to do_one_initcall() results in:

| WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260

Which 'faddr2line' refuses to accept:

$ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
no match for do_one_initcall+0xf4/0x260

Filter out entries from readelf that are not "FUNC" type, so that the
size of a symbol is calculated as a delta to the next function address.

Cc: Josh Poimboeuf <[email protected]>
Cc: John Stultz <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
scripts/faddr2line | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/faddr2line b/scripts/faddr2line
index 0e73aca4f908..60368a1cdaed 100755
--- a/scripts/faddr2line
+++ b/scripts/faddr2line
@@ -179,7 +179,7 @@ __faddr2line() {
found=2
break
fi
- done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$7 == sec' | sort --key=2)
+ done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v sec=$sym_sec '$4 == "FUNC" && $7 == sec' | sort --key=2)

if [[ $found = 0 ]]; then
warn "can't find symbol: sym_name: $sym_name sym_sec: $sym_sec sym_addr: $sym_addr sym_elf_size: $sym_elf_size"
--
2.41.0.487.g6d72f3e995-goog


2023-07-25 00:54:37

by Josh Poimboeuf

[permalink] [raw]
Subject: Re: [PATCH 1/2] scripts/faddr2line: Ignore non-function symbols in readelf output

On Mon, Jul 24, 2023 at 06:45:16PM +0100, Will Deacon wrote:
> Non-function symbols emitted in the readelf output can confuse the
> 'faddr2line' symbol size calculation, resulting in the erroneous
> rejection of valid offsets. This is especially prevalent when building
> an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are
> prefixed with a 32-bit data value in a '$d.n' section. For example:
>
> 447538: ffff800080014b80 548 FUNC GLOBAL DEFAULT 2 do_one_initcall
> 104: ffff800080014c74 0 NOTYPE LOCAL DEFAULT 2 $x.73
> 106: ffff800080014d30 0 NOTYPE LOCAL DEFAULT 2 $x.75
> 111: ffff800080014da4 0 NOTYPE LOCAL DEFAULT 2 $d.78
> 112: ffff800080014da8 0 NOTYPE LOCAL DEFAULT 2 $x.79
> 36: ffff800080014de0 200 FUNC LOCAL DEFAULT 2 run_init_process
>
> Adding a warning to do_one_initcall() results in:
>
> | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
>
> Which 'faddr2line' refuses to accept:
>
> $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> no match for do_one_initcall+0xf4/0x260
>
> Filter out entries from readelf that are not "FUNC" type, so that the
> size of a symbol is calculated as a delta to the next function address.

Problem is, I think the kernel's symbol printing code prints the nearest
kallsyms symbol, and there are some valid non-FUNC code symbols. For
example, syscall_return_via_sysret. This patch would cause those valid
symbols to get missed.

I presume these '$x.*' symbols don't end up in System.map? How do they
get filtered out? By the linker maybe?

We may want to use whatever criteria used there, e.g. have faddr2line
ignore symbols starting with '$' or so?

--
Josh

2023-07-25 18:18:59

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH 1/2] scripts/faddr2line: Ignore non-function symbols in readelf output

On Mon, Jul 24, 2023 at 04:47:34PM -0700, Josh Poimboeuf wrote:
> On Mon, Jul 24, 2023 at 06:45:16PM +0100, Will Deacon wrote:
> > Non-function symbols emitted in the readelf output can confuse the
> > 'faddr2line' symbol size calculation, resulting in the erroneous
> > rejection of valid offsets. This is especially prevalent when building
> > an arm64 kernel with CONFIG_CFI_CLANG=y, where most functions are
> > prefixed with a 32-bit data value in a '$d.n' section. For example:
> >
> > 447538: ffff800080014b80 548 FUNC GLOBAL DEFAULT 2 do_one_initcall
> > 104: ffff800080014c74 0 NOTYPE LOCAL DEFAULT 2 $x.73
> > 106: ffff800080014d30 0 NOTYPE LOCAL DEFAULT 2 $x.75
> > 111: ffff800080014da4 0 NOTYPE LOCAL DEFAULT 2 $d.78
> > 112: ffff800080014da8 0 NOTYPE LOCAL DEFAULT 2 $x.79
> > 36: ffff800080014de0 200 FUNC LOCAL DEFAULT 2 run_init_process
> >
> > Adding a warning to do_one_initcall() results in:
> >
> > | WARNING: CPU: 0 PID: 1 at init/main.c:1236 do_one_initcall+0xf4/0x260
> >
> > Which 'faddr2line' refuses to accept:
> >
> > $ ./scripts/faddr2line vmlinux do_one_initcall+0xf4/0x260
> > skipping do_one_initcall address at 0xffff800080014c74 due to size mismatch (0x260 != 0x224)
> > no match for do_one_initcall+0xf4/0x260
> >
> > Filter out entries from readelf that are not "FUNC" type, so that the
> > size of a symbol is calculated as a delta to the next function address.
>
> Problem is, I think the kernel's symbol printing code prints the nearest
> kallsyms symbol, and there are some valid non-FUNC code symbols. For
> example, syscall_return_via_sysret. This patch would cause those valid
> symbols to get missed.

Damn, yes, I can see a few of those in the arm64 vmlinux too. Thanks for
pointing it out.

> I presume these '$x.*' symbols don't end up in System.map? How do they
> get filtered out? By the linker maybe?

See the horrible ad-hoc list of ignored symbol types which get regexed
out by scripts/mksysmap.

> We may want to use whatever criteria used there, e.g. have faddr2line
> ignore symbols starting with '$' or so?

My first hack for this did exactly that (ignore symbols starting with '$'),
but then I tried to find ways to make it a bit more general. Lemme have a
crack at factoring out the stuff from mksysmap before I resort back to
checking for the '$' prefix.

Will

Subject: [tip: objtool/core] scripts/faddr2line: Don't filter out non-function symbols from readelf

The following commit has been merged into the objtool/core branch of tip:

Commit-ID: 180af1a5bdaf8d4964837a46a9fce8c3a7fd2d97
Gitweb: https://git.kernel.org/tip/180af1a5bdaf8d4964837a46a9fce8c3a7fd2d97
Author: Will Deacon <[email protected]>
AuthorDate: Mon, 02 Oct 2023 17:57:47 +01:00
Committer: Josh Poimboeuf <[email protected]>
CommitterDate: Mon, 23 Oct 2023 08:35:01 -07:00

scripts/faddr2line: Don't filter out non-function symbols from readelf

As Josh points out in 20230724234734.zy67gm674vl3p3wv@treble:

> Problem is, I think the kernel's symbol printing code prints the
> nearest kallsyms symbol, and there are some valid non-FUNC code
> symbols. For example, syscall_return_via_sysret.

so we shouldn't be considering only 'FUNC'-type symbols in the output
from readelf.

Drop the function symbol type filtering from the faddr2line outer loop.

Suggested-by: Josh Poimboeuf <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Link: https://lore.kernel.org/r/20230724234734.zy67gm674vl3p3wv@treble
Signed-off-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Josh Poimboeuf <[email protected]>
---
scripts/faddr2line | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/faddr2line b/scripts/faddr2line
index 0e73aca..a35a420 100755
--- a/scripts/faddr2line
+++ b/scripts/faddr2line
@@ -260,7 +260,7 @@ __faddr2line() {

DONE=1

- done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v fn=$sym_name '$4 == "FUNC" && $8 == fn')
+ done < <(${READELF} --symbols --wide $objfile | sed 's/\[.*\]//' | ${AWK} -v fn=$sym_name '$8 == fn')
}

[[ $# -lt 2 ]] && usage