2022-02-16 19:06:39

by Steven Rostedt

Subject: Re: ftrace startup tests crashing due to missing rcu_synchronize()

On Wed, 16 Feb 2022 13:54:19 -0500
Steven Rostedt wrote:

> That is, shutdown is called, the item is removed from the list and freed,
> but something got preempted while on the ftrace trampoline, with a
> reference to the item, and then woke up and executed the item that was
> freed.
>
> I'll look into it. Thanks for the report.

OK, I wonder if something changed with "is_kernel_core_data()"?

Because on registering, we have:

	if (!is_kernel_core_data((unsigned long)ops))
		ops->flags |= FTRACE_OPS_FL_DYNAMIC;


and in the shutdown, we have:

	/*
	 * Dynamic ops may be freed, we must make sure that all
	 * callers are done before leaving this function.
	 * The same goes for freeing the per_cpu data of the per_cpu
	 * ops.
	 */
	if (ops->flags & FTRACE_OPS_FL_DYNAMIC) {
		/*
		 * We need to do a hard force of sched synchronization.
		 * This is because we use preempt_disable() to do RCU, but
		 * the function tracers can be called where RCU is not watching
		 * (like before user_exit()). We can not rely on the RCU
		 * infrastructure to do the synchronization, thus we must do it
		 * ourselves.
		 */
		synchronize_rcu_tasks_rude();

		/*
		 * When the kernel is preemptive, tasks can be preempted
		 * while on a ftrace trampoline. Just scheduling a task on
		 * a CPU is not good enough to flush them. Calling
		 * synchronize_rcu_tasks() will wait for those tasks to
		 * execute and either schedule voluntarily or enter user space.
		 */
		if (IS_ENABLED(CONFIG_PREEMPTION))
			synchronize_rcu_tasks();

 free_ops:
		ftrace_trampoline_free(ops);
	}


If the ops is not flagged as dynamically allocated (FTRACE_OPS_FL_DYNAMIC),
or if one of the RCU synchronizations has changed and now lets us continue
too early, then that would cause what you see.

-- Steve


2022-02-17 06:27:56

by Sven Schnelle

Subject: Re: ftrace startup tests crashing due to missing rcu_synchronize()

Steven Rostedt writes:

> On Wed, 16 Feb 2022 13:54:19 -0500
> Steven Rostedt wrote:
>
>> That is, shutdown is called, the item is removed from the list and freed,
>> but something got preempted while on the ftrace trampoline, with a
>> reference to the item, and then woke up and executed the item that was
>> freed.
>>
>> I'll look into it. Thanks for the report.
>
> OK, I wonder if something changed with "is_kernel_core_data()"?
>
> Because on registering, we have:
>
> 	if (!is_kernel_core_data((unsigned long)ops))
> 		ops->flags |= FTRACE_OPS_FL_DYNAMIC;

I checked, and the flag does get set here. I cannot say whether it
is also set when the system crashes, but I would expect it to be.