ftrace_location() was changed to return the __fentry__ location not only
when called for the __fentry__ location itself, but also when called for
the sym+0 location, by commit aebfd12521d9 ("x86/ibt,ftrace: Search for
__fentry__ location"). That is, if the sym+0 location is not __fentry__,
ftrace_location() searches for one over the entire size of the sym.

However, there are cases where more than one __fentry__ exists in the
sym range (described below), and the binary search in ftrace_location()
can then find the wrong __fentry__ location, which causes its users,
such as livepatch/kprobe/bpf, to not work properly on this sym!
The case is that, based on current compiler behavior, suppose:
- function A is followed by weak function B1 in the same binary file;
- weak function B1 is overridden by function B2;
Then in the final binary file:
- symbol B1 is removed from the symbol table while its instructions are
  not removed;
- the __fentry__ of B1 is still in the __mcount_loc table;
- the function size of A is computed by subtracting the symbol address
  of A from the address of the next symbol (see
  kallsyms_lookup_size_offset()), but because the symbol info of B1 was
  removed, the next symbol of A is now what was originally the next
  symbol of B1. In the following layout, the function size of A will be
  (symbol_address_C - symbol_address_A):

    symbol_address_A
    symbol_address_B1 (not in symbol table)
    symbol_address_C
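
A tiny standalone illustration of that size computation (the addresses
are made up):

    /* kallsyms derives a symbol's size from the distance to the next
     * *surviving* symbol, so once B1 is dropped from the table, A
     * appears to span B1's body too, including B1's stale __fentry__
     * site. */
    #include <stdio.h>

    static const unsigned long sym_addr[] = {
            0x1000, /* A */
            /* 0x1040: B1 - removed from the symbol table */
            0x1080, /* C */
    };

    int main(void)
    {
            /* Size of A, computed as kallsyms_lookup_size_offset()
             * would compute it: distance to the next table entry. */
            unsigned long size_A = sym_addr[1] - sym_addr[0];

            printf("size of A = 0x%lx\n", size_A); /* 0x80, not 0x40 */
            return 0;
    }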
This weak function issue was discovered in commit b39181f7c690
("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function"),
but that commit did not resolve the issue in ftrace_location().
There are several possible resolutions:

1. Shrink the search range when __fentry__ is not at the sym+0 location,
   for example by using the macro FTRACE_MCOUNT_MAX_OFFSET. This
   requires every arch to define its own FTRACE_MCOUNT_MAX_OFFSET:

     ftrace_location() {
             ...
             if (!offset)
                     loc = ftrace_location_range(ip, ip + FTRACE_MCOUNT_MAX_OFFSET + 1);
             ...
     }

2. Define an arch-specific arch_ftrace_location() that handles each
   arch's particular cases of __fentry__ placement, for example:

     ftrace_location() {
             ...
             if (!offset)
                     loc = arch_ftrace_location(ip);
             ...
     }

3. Skip the __fentry__ of overridden weak functions in
   ftrace_process_locs(), so that all records in ftrace_pages are
   valid; see the sketch below. The reason this scheme can work is that
   both __mcount_loc and the symbol table are sorted, and it can be
   assumed that one function has only one __fentry__ location. Then
   commit b39181f7c690 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid
   adding weak function") can be reverted (not done in this patch).
   However, looking up the size and offset of every record in the
   __mcount_loc table will slow down system boot and module load.

Solutions 1 and 2 require every arch to handle the complex __fentry__
location cases, so this RFC implements solution 3.
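
A minimal standalone sketch of the solution 3 skip logic (addresses are
made up; the real implementation is in the patch below):

    /* Walk the sorted __mcount_loc entries, resolve each to its
     * containing function start, and skip entries whose function start
     * repeats -- those are the dangling __fentry__ sites left by
     * overridden weak functions. */
    #include <stdio.h>

    /* Surviving symbol addresses, sorted (B1 was removed). */
    static const unsigned long sym_addr[] = { 0x1000 /* A */, 0x1080 /* C */ };

    /* Sorted __mcount_loc entries; 0x1044 belongs to the dead B1. */
    static const unsigned long mcount_loc[] = { 0x1004, 0x1044, 0x1084 };

    /* Resolve addr to its function start, as a kallsyms lookup would. */
    static unsigned long func_start(unsigned long addr)
    {
            unsigned long best = 0;
            size_t i;

            for (i = 0; i < sizeof(sym_addr) / sizeof(sym_addr[0]); i++)
                    if (sym_addr[i] <= addr)
                            best = sym_addr[i];
            return best;
    }

    int main(void)
    {
            unsigned long last_func = 0;
            size_t i;

            for (i = 0; i < sizeof(mcount_loc) / sizeof(mcount_loc[0]); i++) {
                    unsigned long cur_func = func_start(mcount_loc[i]);

                    if (cur_func == last_func) {
                            printf("0x%lx: skipped (duplicate __fentry__ in sym)\n",
                                   mcount_loc[i]);
                            continue;
                    }
                    last_func = cur_func;
                    printf("0x%lx: kept\n", mcount_loc[i]);
            }
            return 0;
    }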
Fixes: aebfd12521d9 ("x86/ibt,ftrace: Search for __fentry__ location")
Signed-off-by: Zheng Yejian <[email protected]>
---
include/linux/module.h | 8 ++++++++
kernel/module/kallsyms.c | 23 +++++++++++++++++------
kernel/trace/ftrace.c | 20 +++++++++++++-------
3 files changed, 38 insertions(+), 13 deletions(-)
diff --git a/include/linux/module.h b/include/linux/module.h
index ffa1c603163c..3d5a2165160d 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -954,6 +954,9 @@ unsigned long module_kallsyms_lookup_name(const char *name);
unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name);
+int find_kallsyms_symbol(struct module *mod, unsigned long addr,
+ unsigned long *size, unsigned long *offset);
+
#else /* CONFIG_MODULES && CONFIG_KALLSYMS */
static inline int module_kallsyms_on_each_symbol(const char *modname,
@@ -997,6 +1000,11 @@ static inline unsigned long find_kallsyms_symbol_value(struct module *mod,
return 0;
}
+static inline int find_kallsyms_symbol(struct module *mod, unsigned long addr,
+ unsigned long *size, unsigned long *offset)
+{
+ return 0;
+}
#endif /* CONFIG_MODULES && CONFIG_KALLSYMS */
#endif /* _LINUX_MODULE_H */
diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
index 62fb57bb9f16..d70fb4ead794 100644
--- a/kernel/module/kallsyms.c
+++ b/kernel/module/kallsyms.c
@@ -253,10 +253,10 @@ static const char *kallsyms_symbol_name(struct mod_kallsyms *kallsyms, unsigned
* Given a module and address, find the corresponding symbol and return its name
* while providing its size and offset if needed.
*/
-static const char *find_kallsyms_symbol(struct module *mod,
- unsigned long addr,
- unsigned long *size,
- unsigned long *offset)
+static const char *__find_kallsyms_symbol(struct module *mod,
+ unsigned long addr,
+ unsigned long *size,
+ unsigned long *offset)
{
unsigned int i, best = 0;
unsigned long nextval, bestval;
@@ -311,6 +311,17 @@ static const char *find_kallsyms_symbol(struct module *mod,
return kallsyms_symbol_name(kallsyms, best);
}
+int find_kallsyms_symbol(struct module *mod, unsigned long addr,
+ unsigned long *size, unsigned long *offset)
+{
+ const char *ret;
+
+ preempt_disable();
+ ret = __find_kallsyms_symbol(mod, addr, size, offset);
+ preempt_enable();
+ return !!ret;
+}
+
void * __weak dereference_module_function_descriptor(struct module *mod,
void *ptr)
{
@@ -344,7 +355,7 @@ const char *module_address_lookup(unsigned long addr,
#endif
}
- ret = find_kallsyms_symbol(mod, addr, size, offset);
+ ret = __find_kallsyms_symbol(mod, addr, size, offset);
}
/* Make a copy in here where it's safe */
if (ret) {
@@ -367,7 +378,7 @@ int lookup_module_symbol_name(unsigned long addr, char *symname)
if (within_module(addr, mod)) {
const char *sym;
- sym = find_kallsyms_symbol(mod, addr, NULL, NULL);
+ sym = __find_kallsyms_symbol(mod, addr, NULL, NULL);
if (!sym)
goto out;
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 65208d3b5ed9..3c56be753ae8 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -6488,6 +6488,7 @@ static int ftrace_process_locs(struct module *mod,
unsigned long addr;
unsigned long flags = 0; /* Shut up gcc */
int ret = -ENOMEM;
+ unsigned long last_func = 0;
count = end - start;
@@ -6538,6 +6539,8 @@ static int ftrace_process_locs(struct module *mod,
pg = start_pg;
while (p < end) {
unsigned long end_offset;
+ unsigned long cur_func, off;
+
addr = ftrace_call_adjust(*p++);
/*
* Some architecture linkers will pad between
@@ -6549,6 +6552,16 @@ static int ftrace_process_locs(struct module *mod,
skipped++;
continue;
}
+ if (mod)
+ WARN_ON_ONCE(!find_kallsyms_symbol(mod, addr, NULL, &off));
+ else
+ WARN_ON_ONCE(!kallsyms_lookup_size_offset(addr, NULL, &off));
+ cur_func = addr - off;
+ if (cur_func == last_func) {
+ skipped++;
+ continue;
+ }
+ last_func = cur_func;
end_offset = (pg->index+1) * sizeof(pg->records[0]);
if (end_offset > PAGE_SIZE << pg->order) {
@@ -6860,13 +6873,6 @@ void ftrace_module_enable(struct module *mod)
if (!within_module(rec->ip, mod))
break;
- /* Weak functions should still be ignored */
- if (!test_for_valid_rec(rec)) {
- /* Clear all other flags. Should not be enabled anyway */
- rec->flags = FTRACE_FL_DISABLED;
- continue;
- }
-
cnt = 0;
/*
--
2.25.1
On Fri, 7 Jun 2024 17:02:28 +0200
Peter Zijlstra <[email protected]> wrote:
> > There are several possible resolutions:
>
> Oh gawd, sodding weak functions again.
>
> I would suggest changing scripts/kallsyms.c to emit readily identifiable
> symbol names for all the weak junk, eg:
>
> __weak_junk_NNNNN
>
> That instantly fixes the immediate problem and Steve's horrid hack can
> go away.
Right. And when I wrote that hack, I specifically said this should be
fixed in kallsyms, and preferably at build time, as that's when the
weak functions should all be resolved.
-- Steve
On Fri, Jun 07, 2024 at 07:52:11PM +0800, Zheng Yejian wrote:
> There are several possible resolutions:
Oh gawd, sodding weak functions again.
I would suggest changing scripts/kallsyms.c to emit readily identifiable
symbol names for all the weak junk, eg:
__weak_junk_NNNNN
That instantly fixes the immediate problem and Steve's horrid hack can
go away.
Additionally, I would add a boot-up pass that would INT3-fill all such
functions and remove/invalidate all
static_call/static_jump/fentry/alternative entries that are inside of them.
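
A rough sketch of what such a pass might look like, assuming the build
emits __weak_junk_NNNNN symbols; every helper named below is
hypothetical, not an existing kernel API:

    /*
     * Hypothetical boot-time pass (sketch only): fill each discarded
     * weak function body with INT3 (0xCC on x86) so stray execution
     * traps, and drop any patch-site entries that land inside it.
     * All iterators/helpers here are made up for illustration; real
     * code would also have to patch kernel text via text_poke() rather
     * than plain memset().
     */
    static void __init poison_weak_junk(void)
    {
            unsigned long start, end;

            /* Hypothetical iterator over the __weak_junk_NNNNN symbols. */
            for_each_weak_junk_symbol(start, end) {
                    memset((void *)start, 0xCC, end - start);

                    /* Hypothetical range-invalidation helpers. */
                    static_call_invalidate_range(start, end);
                    jump_label_invalidate_range(start, end);
                    ftrace_invalidate_range(start, end);
                    alternatives_invalidate_range(start, end);
            }
    }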
On 2024/6/7 23:02, Peter Zijlstra wrote:
> On Fri, Jun 07, 2024 at 07:52:11PM +0800, Zheng Yejian wrote:
>> There are several possible resolutions:
>
> Oh gawd, sodding weak functions again.
>
> I would suggest changing scripts/kallsyms.c to emit readily identifiable
> symbol names for all the weak junk, eg:
>
> __weak_junk_NNNNN
>
Sorry for the late reply, I just had a long noon holiday :>

scripts/kallsyms.c is compiled and used to handle symbols in vmlinux.o
or vmlinux.a, see kallsyms_step() in scripts/link-vmlinux.sh; the
overridden weak symbols have already been removed from the symbol table
of vmlinux.o or vmlinux.a. But we can still find those symbols in the
original xx/xx.o files. For example, when the weak free_initmem() in
init/main.c is overridden, its symbol is not in vmlinux but is still in
init/main.o.

How about traversing all the original xx/xx.o files and finding all the
weak junk symbols there?
> That instantly fixes the immediate problem and Steve's horrid hack can
> go away.
>
Yes, this can be done in the same patch series.
> Additionally, I would add a boot-up pass that would INT3-fill all such
> functions and remove/invalidate all
> static_call/static_jump/fentry/alternative entries that are inside of them.
--
Thanks,
Zheng Yejian
On 2024/6/11 17:21, Peter Zijlstra wrote:
> On Tue, Jun 11, 2024 at 09:56:51AM +0800, Zheng Yejian wrote:
>> How about traversing all the original xx/xx.o files and finding all the
>> weak junk symbols there?
>
> You don't need to. ELF symbol tables have an entry size for FUNC type
> objects, this means that you can readily find holes in the text and fill
> them with a symbol.
>
> Specifically, you can check the mcount locations against the symbol
> table and for every one that falls in a hole, generate a new junk
> symbol.
>
> Also see 4adb23686795 where objtool adds these holes to the
> ignore/unreachable code check.
>
>
> The lack of size info in kallsyms is in large part what is causing the
> problems.
Thanks for your suggestions, I'll try them soon.
--
Thanks,
ZYJ
On Tue, Jun 11, 2024 at 09:56:51AM +0800, Zheng Yejian wrote:
> How about traversing all the original xx/xx.o files and finding all the
> weak junk symbols there?
You don't need to. ELF symbol tables have an entry size for FUNC type
objects, this means that you can readily find holes in the text and fill
them with a symbol.
Specifically, you can check the mcount locations against the symbol
table and for every one that falls in a hole, generate a new junk
symbol.
Also see 4adb23686795 where objtool adds these holes to the
ignore/unreachable code check.
The lack of size info in kallsyms is in large part what is causing the
problems.
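
A standalone sketch of that hole check (the symbol and mcount data are
made up; in reality they would come from the object's ELF symbol table
and __mcount_loc section):

    /* With FUNC symbols carrying st_size, any mcount location that
     * does not fall inside [st_value, st_value + st_size) of some
     * symbol sits in a hole left by a discarded weak function, and
     * gets a generated junk symbol. */
    #include <stdio.h>

    struct sym { unsigned long addr, size; const char *name; };

    /* Sorted FUNC symbols, as read from the symbol table. */
    static const struct sym syms[] = {
            { 0x1000, 0x40, "A" },
            /* B1 was a discarded weak function: 0x1040..0x1080 is a hole. */
            { 0x1080, 0x30, "C" },
    };

    /* __mcount_loc entries; 0x1044 is B1's orphaned __fentry__ site. */
    static const unsigned long mcount_loc[] = { 0x1004, 0x1044, 0x1084 };

    static const struct sym *find_sym(unsigned long addr)
    {
            size_t i;

            for (i = 0; i < sizeof(syms) / sizeof(syms[0]); i++)
                    if (addr >= syms[i].addr &&
                        addr < syms[i].addr + syms[i].size)
                            return &syms[i];
            return NULL; /* falls in a hole */
    }

    int main(void)
    {
            size_t i;

            for (i = 0; i < sizeof(mcount_loc) / sizeof(mcount_loc[0]); i++) {
                    const struct sym *s = find_sym(mcount_loc[i]);

                    if (s)
                            printf("0x%lx -> %s\n", mcount_loc[i], s->name);
                    else
                            printf("0x%lx -> hole, emit __weak_junk_%zu\n",
                                   mcount_loc[i], i);
            }
            return 0;
    }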