When running in qemu on a kernel built with allmodconfig and debug
options (in particular kcov and ubsan) enabled, the ftrace_replace_code
function call takes minutes. The ftrace selftest calls
ftrace_replace_code to loop >40000 times through
ftrace_make_call/ftrace_make_nop, and these end up calling
__aarch64_insn_write/aarch64_insn_patch_text_nosync.
Microseconds add up because this is called in a loop for each dyn_ftrace
record, and this triggers the softlockup watchdog unless we let it sleep
occasionally.
Rework so that we call cond_resched() if !irqs_disabled() && !preempt_count().
Suggested-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Anders Roxell <[email protected]>
---
kernel/trace/ftrace.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index c375e33239f7..7080eb464983 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2419,11 +2419,19 @@ void __weak ftrace_replace_code(int enable)
 {
 	struct dyn_ftrace *rec;
 	struct ftrace_page *pg;
+	bool schedulable;
 	int failed;
 
 	if (unlikely(ftrace_disabled))
 		return;
 
+	/*
+	 * Some archs call this function with interrupts or preemption
+	 * disabled. However, for other archs that can preempt, this can cause
+	 * a tremendous amount of unneeded latency.
+	 */
+	schedulable = !irqs_disabled() && !preempt_count();
+
 	do_for_each_ftrace_rec(pg, rec) {
 
 		if (rec->flags & FTRACE_FL_DISABLED)
@@ -2435,6 +2443,8 @@ void __weak ftrace_replace_code(int enable)
 			/* Stop processing */
 			return;
 		}
+		if (schedulable)
+			cond_resched();
 	} while_for_each_ftrace_rec();
 }
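
For reference, with this applied ftrace_replace_code() would read
roughly like this (a sketch of the resulting code, not part of the
patch itself):

	void __weak ftrace_replace_code(int enable)
	{
		struct dyn_ftrace *rec;
		struct ftrace_page *pg;
		bool schedulable;
		int failed;

		if (unlikely(ftrace_disabled))
			return;

		/* Only sleep if the caller left irqs and preemption enabled */
		schedulable = !irqs_disabled() && !preempt_count();

		do_for_each_ftrace_rec(pg, rec) {

			if (rec->flags & FTRACE_FL_DISABLED)
				continue;

			failed = __ftrace_replace_code(rec, enable);
			if (failed) {
				ftrace_bug(failed, rec);
				/* Stop processing */
				return;
			}
			/* Let the softlockup watchdog breathe between records */
			if (schedulable)
				cond_resched();
		} while_for_each_ftrace_rec();
	}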
--
2.19.2
Hi Anders, Steve,
On Tue, Dec 04, 2018 at 08:29:03PM +0100, Anders Roxell wrote:
> When running in qemu on a kernel built with allmodconfig and debug
> options (in particular kcov and ubsan) enabled, the ftrace_replace_code
> function call takes minutes. The ftrace selftest calls
> ftrace_replace_code to loop >40000 times through
> ftrace_make_call/ftrace_make_nop, and these end up calling
> __aarch64_insn_write/aarch64_insn_patch_text_nosync.
>
> Microseconds add up because this is called in a loop for each dyn_ftrace
> record, and this triggers the softlockup watchdog unless we let it sleep
> occasionally.
>
> Rework so that we call cond_resched() if !irqs_disabled() && !preempt_count().
>
> Suggested-by: Steven Rostedt (VMware) <[email protected]>
> Signed-off-by: Anders Roxell <[email protected]>
> ---
> kernel/trace/ftrace.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index c375e33239f7..7080eb464983 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2419,11 +2419,19 @@ void __weak ftrace_replace_code(int enable)
> {
> struct dyn_ftrace *rec;
> struct ftrace_page *pg;
> + bool schedulable;
> int failed;
>
> if (unlikely(ftrace_disabled))
> return;
>
> + /*
> > + * Some archs call this function with interrupts or preemption
> > + * disabled. However, for other archs that can preempt, this can cause
> > + * a tremendous amount of unneeded latency.
> + */
> + schedulable = !irqs_disabled() && !preempt_count();
Is there a reason not to use preemptible() here?
Will
On Wed, 5 Dec 2018 at 10:54, Will Deacon <[email protected]> wrote:
>
> Hi Anders, Steve,
>
> On Tue, Dec 04, 2018 at 08:29:03PM +0100, Anders Roxell wrote:
> > [...]
> > + schedulable = !irqs_disabled() && !preempt_count();
>
> Is there a reason not to use preemptible() here?
As I understand it, preemptible() is defined to 0 if
CONFIG_PREEMPT_COUNT is disabled. That's no good, right?
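
For reference, the definitions in include/linux/preempt.h look roughly
like this:

    #ifdef CONFIG_PREEMPT_COUNT
    /* Can preempt only when the count is zero and irqs are enabled */
    #define preemptible()	(preempt_count() == 0 && !irqs_disabled())
    #else
    /* No preempt count is maintained, so preemptible() is always false */
    #define preemptible()	0
    #endif

So on a kernel without CONFIG_PREEMPT_COUNT the check would always be
false and we would never call cond_resched().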
Cheers,
Anders
On Wed, 5 Dec 2018 11:43:12 +0100
Anders Roxell <[email protected]> wrote:
> > > + schedulable = !irqs_disabled() && !preempt_count();
> >
> > Is there a reason not to use preemptible() here?
>
> As I understand it, preemptible() is defined to 0 if
> CONFIG_PREEMPT_COUNT is disabled.
> That's no good, right?
No, it's not, which means this isn't a good approach. I have a much
better idea on how to solve this. I'll post a small patch set in a bit.
-- Steve