2022-12-16 11:26:01

by Anders Roxell

[permalink] [raw]
Subject: BUG: sleeping function called from invalid context at kernel/kallsyms.c:305

Hi,

I'm building an arm64 allmodconfig kernel (see the .config [1]) on
yesterdays next tag 20221215 with KALLSYMS_SELFTEST enabled and I saw
the following bug when the selftest ran:

[29725.015182][ T58] BUG: sleeping function called from invalid
context at kernel/kallsyms.c:305
[29725.022953][ T58] in_atomic(): 0, irqs_disabled(): 128,
non_block: 0, pid: 58, name: kallsyms_test
[29725.031272][ T58] preempt_count: 0, expected: 0
[29725.035903][ T58] RCU nest depth: 0, expected: 0
[29725.040574][ T58] no locks held by kallsyms_test/58.
[29725.045494][ T58] irq event stamp: 18899904
[29725.049809][ T58] hardirqs last enabled at (18899903):
finish_task_switch.isra.0 (core.c:?)
[29725.059608][ T58] hardirqs last disabled at (18899904):
test_perf_kallsyms_on_each_symbol (kallsyms_selftest.c:?)
[29725.069936][ T58] softirqs last enabled at (18899886): __do_softirq (??:?)
[29725.078670][ T58] softirqs last disabled at (18899879):
____do_softirq (irq.c:?)
[29725.087399][ T58] CPU: 0 PID: 58 Comm: kallsyms_test Tainted: G
T 6.1.0-next-20221215 #2
0a142be8faea13ac333ed9a1cf4e71b6966ad16e
[29725.099607][ T58] Hardware name: linux,dummy-virt (DT)
[29725.104674][ T58] Call trace:
[29725.107909][ T58] dump_backtrace (??:?)
[29725.112706][ T58] show_stack (??:?)
[29725.116883][ T58] dump_stack_lvl (??:?)
[29725.121666][ T58] dump_stack (??:?)
[29725.125852][ T58] __might_resched (??:?)
[29725.130712][ T58] kallsyms_on_each_symbol (??:?)
[29725.136018][ T58] test_perf_kallsyms_on_each_symbol (kallsyms_selftest.c:?)
[29725.142008][ T58] test_entry (kallsyms_selftest.c:?)
[29725.146312][ T58] kthread (kthread.c:?)
[29725.150567][ T58] ret_from_fork (??:?)
[29734.975283][ T58] kallsyms_selftest: kallsyms_on_each_symbol()
traverse all: 5744310840 ns
[29734.992268][ T58] kallsyms_selftest:
kallsyms_on_each_match_symbol() traverse all: 1164580 ns
[29735.049679][ T58] kallsyms_selftest: finish

looks like the issue is that test_perf_kallsyms_on_each_symbol() does
these function calls:

local_irq_save(flags);
t0 = sched_clock();
kallsyms_on_each_match_symbol(match_symbol, stat.name, &stat);
t1 = sched_clock();
local_irq_restore(flags);

and inside kallsyms_on_each_match_symbol(), cond_resched() is called.

Any ideas how to solve this?

Cheers,
Anders
[1] https://people.linaro.org/~anders.roxell/next-20221215.config


2022-12-17 03:50:31

by Zhen Lei

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/kallsyms.c:305



On 2022/12/16 18:45, Anders Roxell wrote:
> Hi,
>
> I'm building an arm64 allmodconfig kernel (see the .config [1]) on
> yesterdays next tag 20221215 with KALLSYMS_SELFTEST enabled and I saw
> the following bug when the selftest ran:
>
> [29725.015182][ T58] BUG: sleeping function called from invalid
> context at kernel/kallsyms.c:305
> [29725.022953][ T58] in_atomic(): 0, irqs_disabled(): 128,
> non_block: 0, pid: 58, name: kallsyms_test
> [29725.031272][ T58] preempt_count: 0, expected: 0
> [29725.035903][ T58] RCU nest depth: 0, expected: 0
> [29725.040574][ T58] no locks held by kallsyms_test/58.
> [29725.045494][ T58] irq event stamp: 18899904
> [29725.049809][ T58] hardirqs last enabled at (18899903):
> finish_task_switch.isra.0 (core.c:?)
> [29725.059608][ T58] hardirqs last disabled at (18899904):
> test_perf_kallsyms_on_each_symbol (kallsyms_selftest.c:?)
> [29725.069936][ T58] softirqs last enabled at (18899886): __do_softirq (??:?)
> [29725.078670][ T58] softirqs last disabled at (18899879):
> ____do_softirq (irq.c:?)
> [29725.087399][ T58] CPU: 0 PID: 58 Comm: kallsyms_test Tainted: G
> T 6.1.0-next-20221215 #2
> 0a142be8faea13ac333ed9a1cf4e71b6966ad16e
> [29725.099607][ T58] Hardware name: linux,dummy-virt (DT)
> [29725.104674][ T58] Call trace:
> [29725.107909][ T58] dump_backtrace (??:?)
> [29725.112706][ T58] show_stack (??:?)
> [29725.116883][ T58] dump_stack_lvl (??:?)
> [29725.121666][ T58] dump_stack (??:?)
> [29725.125852][ T58] __might_resched (??:?)
> [29725.130712][ T58] kallsyms_on_each_symbol (??:?)
> [29725.136018][ T58] test_perf_kallsyms_on_each_symbol (kallsyms_selftest.c:?)
> [29725.142008][ T58] test_entry (kallsyms_selftest.c:?)
> [29725.146312][ T58] kthread (kthread.c:?)
> [29725.150567][ T58] ret_from_fork (??:?)
> [29734.975283][ T58] kallsyms_selftest: kallsyms_on_each_symbol()
> traverse all: 5744310840 ns
> [29734.992268][ T58] kallsyms_selftest:
> kallsyms_on_each_match_symbol() traverse all: 1164580 ns
> [29735.049679][ T58] kallsyms_selftest: finish
>
> looks like the issue is that test_perf_kallsyms_on_each_symbol() does
> these function calls:
>
> local_irq_save(flags);
> t0 = sched_clock();
> kallsyms_on_each_match_symbol(match_symbol, stat.name, &stat);
> t1 = sched_clock();
> local_irq_restore(flags);
>
> and inside kallsyms_on_each_match_symbol(), cond_resched() is called.
>
> Any ideas how to solve this?

I was initially hoping that the test process would not be interrupted.
It is not a big problem to remove local_irq_save(). From a probabilistic
statistical point of view: it does not affect the horizontal comparison
between kallsyms_on_each_symbol() and kallsyms_on_each_match_symbol(),
and the vertical comparison before and after optimization is also not
affected.

Or do not call cond_resched() during the test. This method seems to be
a little better.

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index e31b4d87a4e1b71..9e07fabaf08e647 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -358,7 +358,8 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
ret = fn(data, namebuf, NULL, kallsyms_sym_address(i));
if (ret != 0)
return ret;
- cond_resched();
+ if (!IS_ENABLED(CONFIG_KALLSYMS_SELFTEST))
+ cond_resched();
}
return 0;
}
@@ -375,7 +376,8 @@ int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),

for (i = start; !ret && i <= end; i++) {
ret = fn(data, kallsyms_sym_address(get_symbol_seq(i)));
- cond_resched();
+ if (!IS_ENABLED(CONFIG_KALLSYMS_SELFTEST))
+ cond_resched();
}

return ret;

>
> Cheers,
> Anders
> [1] https://people.linaro.org/~anders.roxell/next-20221215.config
> .
>

--
Regards,
Zhen Lei

2022-12-20 09:04:44

by Anders Roxell

[permalink] [raw]
Subject: Re: BUG: sleeping function called from invalid context at kernel/kallsyms.c:305

On Sat, 17 Dec 2022 at 04:29, Leizhen (ThunderTown)
<[email protected]> wrote:
>
>
>
> On 2022/12/16 18:45, Anders Roxell wrote:
> > Hi,
> >
> > I'm building an arm64 allmodconfig kernel (see the .config [1]) on
> > yesterdays next tag 20221215 with KALLSYMS_SELFTEST enabled and I saw
> > the following bug when the selftest ran:
> >
> > [29725.015182][ T58] BUG: sleeping function called from invalid
> > context at kernel/kallsyms.c:305
> > [29725.022953][ T58] in_atomic(): 0, irqs_disabled(): 128,
> > non_block: 0, pid: 58, name: kallsyms_test
> > [29725.031272][ T58] preempt_count: 0, expected: 0
> > [29725.035903][ T58] RCU nest depth: 0, expected: 0
> > [29725.040574][ T58] no locks held by kallsyms_test/58.
> > [29725.045494][ T58] irq event stamp: 18899904
> > [29725.049809][ T58] hardirqs last enabled at (18899903):
> > finish_task_switch.isra.0 (core.c:?)
> > [29725.059608][ T58] hardirqs last disabled at (18899904):
> > test_perf_kallsyms_on_each_symbol (kallsyms_selftest.c:?)
> > [29725.069936][ T58] softirqs last enabled at (18899886): __do_softirq (??:?)
> > [29725.078670][ T58] softirqs last disabled at (18899879):
> > ____do_softirq (irq.c:?)
> > [29725.087399][ T58] CPU: 0 PID: 58 Comm: kallsyms_test Tainted: G
> > T 6.1.0-next-20221215 #2
> > 0a142be8faea13ac333ed9a1cf4e71b6966ad16e
> > [29725.099607][ T58] Hardware name: linux,dummy-virt (DT)
> > [29725.104674][ T58] Call trace:
> > [29725.107909][ T58] dump_backtrace (??:?)
> > [29725.112706][ T58] show_stack (??:?)
> > [29725.116883][ T58] dump_stack_lvl (??:?)
> > [29725.121666][ T58] dump_stack (??:?)
> > [29725.125852][ T58] __might_resched (??:?)
> > [29725.130712][ T58] kallsyms_on_each_symbol (??:?)
> > [29725.136018][ T58] test_perf_kallsyms_on_each_symbol (kallsyms_selftest.c:?)
> > [29725.142008][ T58] test_entry (kallsyms_selftest.c:?)
> > [29725.146312][ T58] kthread (kthread.c:?)
> > [29725.150567][ T58] ret_from_fork (??:?)
> > [29734.975283][ T58] kallsyms_selftest: kallsyms_on_each_symbol()
> > traverse all: 5744310840 ns
> > [29734.992268][ T58] kallsyms_selftest:
> > kallsyms_on_each_match_symbol() traverse all: 1164580 ns
> > [29735.049679][ T58] kallsyms_selftest: finish
> >
> > looks like the issue is that test_perf_kallsyms_on_each_symbol() does
> > these function calls:
> >
> > local_irq_save(flags);
> > t0 = sched_clock();
> > kallsyms_on_each_match_symbol(match_symbol, stat.name, &stat);
> > t1 = sched_clock();
> > local_irq_restore(flags);
> >
> > and inside kallsyms_on_each_match_symbol(), cond_resched() is called.
> >
> > Any ideas how to solve this?
>
> I was initially hoping that the test process would not be interrupted.
> It is not a big problem to remove local_irq_save(). From a probabilistic
> statistical point of view: it does not affect the horizontal comparison
> between kallsyms_on_each_symbol() and kallsyms_on_each_match_symbol(),
> and the vertical comparison before and after optimization is also not
> affected.
>
> Or do not call cond_resched() during the test. This method seems to be
> a little better.

I tried this patch and yes it made the kallsyms selftest to finish.

Please send a patch.

Cheers,
Anders

>
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index e31b4d87a4e1b71..9e07fabaf08e647 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -358,7 +358,8 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
> ret = fn(data, namebuf, NULL, kallsyms_sym_address(i));
> if (ret != 0)
> return ret;
> - cond_resched();
> + if (!IS_ENABLED(CONFIG_KALLSYMS_SELFTEST))
> + cond_resched();
> }
> return 0;
> }
> @@ -375,7 +376,8 @@ int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long),
>
> for (i = start; !ret && i <= end; i++) {
> ret = fn(data, kallsyms_sym_address(get_symbol_seq(i)));
> - cond_resched();
> + if (!IS_ENABLED(CONFIG_KALLSYMS_SELFTEST))
> + cond_resched();
> }
>
> return ret;
>
> >
> > Cheers,
> > Anders
> > [1] https://people.linaro.org/~anders.roxell/next-20221215.config
> > .
> >
>
> --
> Regards,
> Zhen Lei