2021-06-22 18:40:53

by Nick Desaulniers

Subject: [PATCH] kallsyms: strip LTO suffixes from static functions

Similar to:
commit 8b8e6b5d3b01 ("kallsyms: strip ThinLTO hashes from static
functions")

It's very common for compilers to modify the symbol name for static
functions as part of optimizing transformations. That makes hooking
static functions (that weren't inlined or DCE'd) with kprobes difficult.

Full LTO uses a different mangling scheme than thin LTO; full LTO
imports all code into effectively one big translation unit. It must
rename static functions to prevent collisions. Strip off these suffixes
so that we can continue to hook such static functions.

Reported-by: KE.LI(Lieke) <[email protected]>
Tested-by: KE.LI(Lieke) <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
kernel/kallsyms.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 4067564ec59f..14cf3a6474de 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -188,6 +188,24 @@ static inline bool cleanup_symbol_name(char *s)
 
 	return res != NULL;
 }
+#elif defined(CONFIG_LTO_CLANG_FULL)
+/*
+ * LLVM mangles static functions for full LTO so that two static functions with
+ * the same identifier do not collide when all code is combined into one
+ * module. The scheme used converts references to foo into
+ * foo.llvm.974640843467629774, for example. This can break hooking of static
+ * functions with kprobes.
+ */
+static inline bool cleanup_symbol_name(char *s)
+{
+	char *res;
+
+	res = strstr(s, ".llvm.");
+	if (res)
+		*res = '\0';
+
+	return res != NULL;
+}
 #else
 static inline bool cleanup_symbol_name(char *s) { return false; }
 #endif
--
2.32.0.288.g62a8d224e6-goog


2021-06-22 20:19:18

by Fangrui Song

Subject: Re: [PATCH] kallsyms: strip LTO suffixes from static functions

On 2021-06-22, 'Nick Desaulniers' via Clang Built Linux wrote:
>Similar to:
>commit 8b8e6b5d3b01 ("kallsyms: strip ThinLTO hashes from static
>functions")
>
>It's very common for compilers to modify the symbol name for static
>functions as part of optimizing transformations. That makes hooking
>static functions (that weren't inlined or DCE'd) with kprobes difficult.
>
>Full LTO uses a different mangling scheme than thin LTO; full LTO
>imports all code into effectively one big translation unit. It must
>rename static functions to prevent collisions. Strip off these suffixes
>so that we can continue to hook such static functions.

See below. The message needs a change.

I can comment on the LTO side, but a maintainer needs to check the
kernel-side logic.

Reviewed-by: Fangrui Song <[email protected]>

>Reported-by: KE.LI(Lieke) <[email protected]>
>Tested-by: KE.LI(Lieke) <[email protected]>
>Signed-off-by: Nick Desaulniers <[email protected]>
>---
> kernel/kallsyms.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
>diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
>index 4067564ec59f..14cf3a6474de 100644
>--- a/kernel/kallsyms.c
>+++ b/kernel/kallsyms.c
>@@ -188,6 +188,24 @@ static inline bool cleanup_symbol_name(char *s)
>
> return res != NULL;
> }
>+#elif defined(CONFIG_LTO_CLANG_FULL)
>+/*
>+ * LLVM mangles static functions for full LTO so that two static functions with
>+ * the same identifier do not collide when all code is combined into one
>+ * module. The scheme used converts references to foo into
>+ * foo.llvm.974640843467629774, for example. This can break hooking of static
>+ * functions with kprobes.
>+ */

The comment should say ThinLTO instead.

The .llvm.123 suffix comes from global scope promotion of local linkage
symbols, and the scheme is ThinLTO specific. When a local linkage symbol
is imported into multiple translation units and compiled into different
object files, the promoted name allows the copies to be deduplicated
during linking. This matters for code size and for correctness when the
function address is taken.

Regular LTO (sometimes called full LTO) uses the regular name.\d+
scheme.

>+static inline bool cleanup_symbol_name(char *s)
>+{
>+ char *res;
>+
>+ res = strstr(s, ".llvm.");
>+ if (res)
>+ *res = '\0';
>+
>+ return res != NULL;
>+}
> #else
> static inline bool cleanup_symbol_name(char *s) { return false; }
> #endif
>--
>2.32.0.288.g62a8d224e6-goog

I wonder whether it makes sense to strip all `.something` suffixes.
For example, the recent -funique-internal-linkage-names (which can
improve sample profile accuracy) uses the `.__uniq.1234` scheme.

Function specialization/clones can create arbitrary `.123` suffixes.


2021-06-28 23:39:44

by Nick Desaulniers

Subject: Re: [PATCH] kallsyms: strip LTO suffixes from static functions

On Tue, Jun 22, 2021 at 1:18 PM Fangrui Song <[email protected]> wrote:
>
> On 2021-06-22, 'Nick Desaulniers' via Clang Built Linux wrote:
> >Similar to:
> >commit 8b8e6b5d3b01 ("kallsyms: strip ThinLTO hashes from static
> >functions")
> >
> >It's very common for compilers to modify the symbol name for static
> >functions as part of optimizing transformations. That makes hooking
> >static functions (that weren't inlined or DCE'd) with kprobes difficult.
> >
> >Full LTO uses a different mangling scheme than thin LTO; full LTO
> >imports all code into effectively one big translation unit. It must
> >rename static functions to prevent collisions. Strip off these suffixes
> >so that we can continue to hook such static functions.
>
> See below. The message needs a change.
>
> I can comment on the LTO side thing, but a maintainer needs to check
> about the kernel side logic.
>
> Reviewed-by: Fangrui Song <[email protected]>
>
> >Reported-by: KE.LI(Lieke) <[email protected]>
> >Tested-by: KE.LI(Lieke) <[email protected]>
> >Signed-off-by: Nick Desaulniers <[email protected]>
> >---
> > kernel/kallsyms.c | 18 ++++++++++++++++++
> > 1 file changed, 18 insertions(+)
> >
> >diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> >index 4067564ec59f..14cf3a6474de 100644
> >--- a/kernel/kallsyms.c
> >+++ b/kernel/kallsyms.c
> >@@ -188,6 +188,24 @@ static inline bool cleanup_symbol_name(char *s)
> >
> > return res != NULL;
> > }
> >+#elif defined(CONFIG_LTO_CLANG_FULL)
> >+/*
> >+ * LLVM mangles static functions for full LTO so that two static functions with
> >+ * the same identifier do not collide when all code is combined into one
> >+ * module. The scheme used converts references to foo into
> >+ * foo.llvm.974640843467629774, for example. This can break hooking of static
> >+ * functions with kprobes.
> >+ */
>
> The comment should say ThinLTO instead.
>
> The .llvm.123 suffix is for global scope promotion for local linkage
> symbols. The scheme is ThinLTO specific. This ensures that a local

Oh, boy. Indeed. I had identified the mangling coming from
getGlobalNameForLocal(), but looking at the call chain now I see:

FunctionImportGlobalProcessing::processGlobalForThinLTO()
-> FunctionImportGlobalProcessing::getPromotedName()
-> ModuleSummaryIndex::getGlobalNameForLocal()

I'm not sure then how I figured it was specific to full LTO.

Android recently switched from thin LTO to full LTO, which is what I
assumed was the cause of the bug report. Rereading our internal bug
report, it was tested against a prior version that did the symbol
truncation for thinLTO. I then assumed this was full LTO specific for
whatever reason, and modified the patch to only apply to full LTO. I
see via the above call chain that this patch is not correct. Let me
send my original patch as a v2. b/189560201 if you're interested.

> linkage symbol, when imported into multiple translation units, then
> compiled into different object files, during linking, the copies can be
> deduplicated. This matters for code size and for correctness when the
> function address is taken.
>
> Regular LTO (sometimes called full LTO) uses the regular name.\d+
> scheme.
>
> >+static inline bool cleanup_symbol_name(char *s)
> >+{
> >+ char *res;
> >+
> >+ res = strstr(s, ".llvm.");
> >+ if (res)
> >+ *res = '\0';
> >+
> >+ return res != NULL;
> >+}
> > #else
> > static inline bool cleanup_symbol_name(char *s) { return false; }
> > #endif
> >--
> >2.32.0.288.g62a8d224e6-goog
>
> I wonder whether it makes sense to strip all `.something` suffixes.
> For example, the recent -funique-internal-linkage-names (which can
> improve sample profile accuracy) uses the `.__uniq.1234` scheme.
>
> Function specialization/clones can create arbitrary `.123` suffixes.

I definitely don't see hooking static functions via kprobes as being
scalable. There are numerous different mangling schemes different
compilers apply to different static functions.

--
Thanks,
~Nick Desaulniers

2021-06-28 23:40:07

by Nick Desaulniers

Subject: Re: [PATCH] kallsyms: strip LTO suffixes from static functions

On Mon, Jun 28, 2021 at 10:54 AM Nick Desaulniers
<[email protected]> wrote:
>
> On Tue, Jun 22, 2021 at 1:18 PM Fangrui Song <[email protected]> wrote:
> >
> > On 2021-06-22, 'Nick Desaulniers' via Clang Built Linux wrote:
> > >+/*
> > >+ * LLVM mangles static functions for full LTO so that two static functions with
> > >+ * the same identifier do not collide when all code is combined into one
> > >+ * module. The scheme used converts references to foo into
> > >+ * foo.llvm.974640843467629774, for example. This can break hooking of static
> > >+ * functions with kprobes.
> > >+ */
> >
> > The comment should say ThinLTO instead.
> >
> > The .llvm.123 suffix is for global scope promotion for local linkage
> > symbols. The scheme is ThinLTO specific. This ensures that a local
>
> Oh, boy. Indeed. I had identified the mangling coming from
> getGlobalNameForLocal(), but looking at the call chain now I see:
>
> FunctionImportGlobalProcessing::processGlobalForThinLTO()
> -> FunctionImportGlobalProcessing::getPromotedName()
> -> ModuleSummaryIndex::getGlobalNameForLocal()
>
> I'm not sure then how I figured it was specific to full LTO.
>
> Android recently switched from thin LTO to full LTO, which is what I
> assumed was the cause of the bug report. Rereading our internal bug
> report, it was tested against a prior version that did the symbol
> truncation for thinLTO. I then assumed this was full LTO specific for
> whatever reason, and modified the patch to only apply to full LTO. I
> see via the above call chain that this patch is not correct. Let me
> send my original patch as a v2. b/189560201 if you're interested.

I can even see the .llvm.<number> symbol names via `llvm-nm` on
vmlinux for thinLTO builds. No such symbols exist for full LTO.

--
Thanks,
~Nick Desaulniers

2021-06-28 23:41:30

by Nick Desaulniers

Subject: [PATCH v2] kallsyms: strip LTO suffixes from static functions

Similar to:
commit 8b8e6b5d3b01 ("kallsyms: strip ThinLTO hashes from static
functions")

It's very common for compilers to modify the symbol name for static
functions as part of optimizing transformations. That makes hooking
static functions (that weren't inlined or DCE'd) with kprobes difficult.

LLVM has yet another name mangling scheme used by thin LTO. Strip off
these suffixes so that we can continue to hook such static functions.

Reported-by: KE.LI(Lieke) <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
---
Changes v1 -> v2:
* Both mangling schemes can occur for thinLTO + CFI; this new scheme can
  also occur for thinLTO without CFI. Split cleanup_symbol_name() into
  two function calls.
* Drop KE.LI's Tested-by tag.
* Do not carry Fangrui's Reviewed-by tag.
* Drop the inline keyword; it is meaningless.

kernel/kallsyms.c | 33 +++++++++++++++++++++++++++++----
1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 4067564ec59f..fbce4a1ec700 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -171,14 +171,30 @@ static unsigned long kallsyms_sym_address(int idx)
 	return kallsyms_relative_base - 1 - kallsyms_offsets[idx];
 }
 
-#if defined(CONFIG_CFI_CLANG) && defined(CONFIG_LTO_CLANG_THIN)
+#ifdef CONFIG_LTO_CLANG_THIN
+/*
+ * LLVM appends a suffix for local variables that must be promoted to global
+ * scope as part of thin LTO. foo() becomes foo.llvm.974640843467629774. This
+ * can break hooking of static functions with kprobes.
+ */
+static bool cleanup_symbol_name_thinlto(char *s)
+{
+	char *res;
+
+	res = strstr(s, ".llvm.");
+	if (res)
+		*res = '\0';
+
+	return res != NULL;
+}
+#ifdef CONFIG_CFI_CLANG
 /*
  * LLVM appends a hash to static function names when ThinLTO and CFI are
  * both enabled, i.e. foo() becomes foo$707af9a22804d33c81801f27dcfe489b.
  * This causes confusion and potentially breaks user space tools, so we
  * strip the suffix from expanded symbol names.
  */
-static inline bool cleanup_symbol_name(char *s)
+static bool cleanup_symbol_name_thinlto_cfi(char *s)
 {
 	char *res;
 
@@ -189,8 +205,17 @@ static inline bool cleanup_symbol_name(char *s)
 	return res != NULL;
 }
 #else
-static inline bool cleanup_symbol_name(char *s) { return false; }
-#endif
+static bool cleanup_symbol_name_thinlto_cfi(char *s) { return false; }
+#endif /* CONFIG_CFI_CLANG */
+#else
+static bool cleanup_symbol_name_thinlto(char *s) { return false; }
+#endif /* CONFIG_LTO_CLANG_THIN */
+
+static bool cleanup_symbol_name(char *s)
+{
+	return cleanup_symbol_name_thinlto(s) &&
+		cleanup_symbol_name_thinlto_cfi(s);
+}
 
 /* Lookup the address for this symbol. Returns 0 if not found. */
 unsigned long kallsyms_lookup_name(const char *name)
--
2.32.0.93.g670b81a890-goog

2021-06-29 00:22:58

by Nathan Chancellor

Subject: Re: [PATCH v2] kallsyms: strip LTO suffixes from static functions

On 6/28/2021 12:05 PM, 'Nick Desaulniers' via Clang Built Linux wrote:
> Similar to:
> commit 8b8e6b5d3b01 ("kallsyms: strip ThinLTO hashes from static
> functions")
>
> It's very common for compilers to modify the symbol name for static
> functions as part of optimizing transformations. That makes hooking
> static functions (that weren't inlined or DCE'd) with kprobes difficult.
>
> LLVM has yet another name mangling scheme used by thin LTO. Strip off
> these suffixes so that we can continue to hook such static functions.
>
> Reported-by: KE.LI(Lieke) <[email protected]>
> Signed-off-by: Nick Desaulniers <[email protected]>
> ---
> Changes v1 -> v2:
> * Both mangling schemes can occur for thinLTO + CFI, this new scheme can
> also occur for thinLTO without CFI. Split cleanup_symbol_name() into
> two function calls.
> * Drop KE.LI's tested by tag.
> * Do not carry Fangrui's Reviewed by tag.
> * Drop the inline keyword; it is meaningless.
>
> kernel/kallsyms.c | 33 +++++++++++++++++++++++++++++----
> 1 file changed, 29 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index 4067564ec59f..fbce4a1ec700 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -171,14 +171,30 @@ static unsigned long kallsyms_sym_address(int idx)
> return kallsyms_relative_base - 1 - kallsyms_offsets[idx];
> }
>
> -#if defined(CONFIG_CFI_CLANG) && defined(CONFIG_LTO_CLANG_THIN)
> +#ifdef CONFIG_LTO_CLANG_THIN
> +/*
> + * LLVM appends a suffix for local variables that must be promoted to global
> + * scope as part of thin LTO. foo() becomes foo.llvm.974640843467629774. This
> + * can break hooking of static functions with kprobes.
> + */
> +static bool cleanup_symbol_name_thinlto(char *s)
> +{
> + char *res;
> +
> + res = strstr(s, ".llvm.");
> + if (res)
> + *res = '\0';
> +
> + return res != NULL;
> +}
> +#ifdef CONFIG_CFI_CLANG
> /*
> * LLVM appends a hash to static function names when ThinLTO and CFI are
> * both enabled, i.e. foo() becomes foo$707af9a22804d33c81801f27dcfe489b.
> * This causes confusion and potentially breaks user space tools, so we
> * strip the suffix from expanded symbol names.
> */
> -static inline bool cleanup_symbol_name(char *s)
> +static bool cleanup_symbol_name_thinlto_cfi(char *s)
> {
> char *res;
>
> @@ -189,8 +205,17 @@ static inline bool cleanup_symbol_name(char *s)
> return res != NULL;
> }
> #else
> -static inline bool cleanup_symbol_name(char *s) { return false; }
> -#endif
> +static bool cleanup_symbol_name_thinlto_cfi(char *s) { return false; }
> +#endif /* CONFIG_CFI_CLANG */
> +#else
> +static bool cleanup_symbol_name_thinlto(char *s) { return false; }
> +#endif /* CONFIG_LTO_CLANG_THIN */
> +
> +static bool cleanup_symbol_name(char *s)
> +{
> + return cleanup_symbol_name_thinlto(s) &&
> + cleanup_symbol_name_thinlto_cfi(s);

Won't this be a build error when CONFIG_LTO_CLANG_THIN=n and
CONFIG_CFI_CLANG=n because cleanup_symbol_name_thinlto_cfi() will not be
defined? Should the cleanup_symbol_name_thinlto_cfi() stub be in the
last else block?

Cheers,
Nathan

> +}
>
> /* Lookup the address for this symbol. Returns 0 if not found. */
> unsigned long kallsyms_lookup_name(const char *name)
>