2015-12-14 02:59:08

by Qiu, PeiyangX

[permalink] [raw]
Subject: [PATCH] ftrace: fix race between ftrace and insmod

We hit ftrace_bug report when booting Android on a 64bit ATOM SOC chip.
Basically, there is a race between insmod and ftrace_run_update_code.

After load_module=>ftrace_module_init, another thread jumps in to call
ftrace_run_update_code=>ftrace_arch_code_modify_prepare
=>set_all_modules_text_rw, to change all modules
as RW. Since the new module is at MODULE_STATE_UNFORMED, the text attribute
is not changed. Then, the 2nd thread goes ahead to change codes.
However, load_module continues to call
complete_formation=>set_section_ro_nx,
then 2nd thread would fail when probing the module's TEXT.

Below patch tries to resolve it.

commit a949ae560a511fe4e3adf48fa44fefded93e5c2b
Author: Steven Rostedt (Red Hat) <[email protected]>
Date: Thu Apr 24 10:40:12 2014 -0400

ftrace/module: Hardcode ftrace_module_init() call into load_module()

But it can't fully resolve the issue.

THis patch holds ftrace_mutex across ftrace_module_init and
complete_formation.

Signed-off-by: Qiu Peiyang <[email protected]>
Signed-off-by: Zhang Yanmin <[email protected]>
---
include/linux/ftrace.h | 6 ++++--
kernel/module.c | 3 ++-
kernel/trace/ftrace.c | 33 ++++++++++++++++++++++-----------
3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index eae6548..4adc279 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -585,7 +585,8 @@ static inline int ftrace_modify_call(struct
dyn_ftrace *rec, unsigned long old_a
extern int ftrace_arch_read_dyn_info(char *buf, int size);

extern int skip_trace(unsigned long ip);
-extern void ftrace_module_init(struct module *mod);
+extern void ftrace_module_init_start(struct module *mod);
+extern void ftrace_module_init_end(struct module *mod);

extern void ftrace_disable_daemon(void);
extern void ftrace_enable_daemon(void);
@@ -595,7 +596,8 @@ static inline int ftrace_force_update(void) { return
0; }
static inline void ftrace_disable_daemon(void) { }
static inline void ftrace_enable_daemon(void) { }
static inline void ftrace_release_mod(struct module *mod) {}
-static inline void ftrace_module_init(struct module *mod) {}
+static inline void ftrace_module_init_start(struct module *mod) {}
+static inline void ftrace_module_init_end(struct module *mod) {}
static inline __init int register_ftrace_command(struct
ftrace_func_command *cmd)
{
return -EINVAL;
diff --git a/kernel/module.c b/kernel/module.c
index 8f051a1..324c5c6 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3506,10 +3506,11 @@ static int load_module(struct load_info *info,
const char __user *uargs,
dynamic_debug_setup(info->debug, info->num_debug);

/* Ftrace init must be called in the MODULE_STATE_UNFORMED state */
- ftrace_module_init(mod);
+ ftrace_module_init_start(mod);

/* Finally it's fully formed, ready to start executing. */
err = complete_formation(mod, info);
+ ftrace_module_init_end(mod);
if (err)
goto ddebug_cleanup;

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 3f743b1..436c199 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4795,7 +4795,7 @@ static int ftrace_cmp_ips(const void *a, const
void *b)
return 0;
}

-static int ftrace_process_locs(struct module *mod,
+static int ftrace_process_locs_nolock(struct module *mod,
unsigned long *start,
unsigned long *end)
{
@@ -4820,8 +4820,6 @@ static int ftrace_process_locs(struct module *mod,
if (!start_pg)
return -ENOMEM;

- mutex_lock(&ftrace_lock);
-
/*
* Core and each module needs their own pages, as
* modules will free them when they are removed.
@@ -4889,8 +4887,18 @@ static int ftrace_process_locs(struct module *mod,
local_irq_restore(flags);
ret = 0;
out:
- mutex_unlock(&ftrace_lock);
+ return ret;
+}

+static int ftrace_process_locs(struct module *mod,
+ unsigned long *start,
+ unsigned long *end)
+{
+ int ret;
+
+ mutex_lock(&ftrace_lock);
+ ret = ftrace_process_locs_nolock(mod, start, end);
+ mutex_unlock(&ftrace_lock);
return ret;
}

@@ -4940,19 +4948,22 @@ void ftrace_release_mod(struct module *mod)
mutex_unlock(&ftrace_lock);
}

-static void ftrace_init_module(struct module *mod,
- unsigned long *start, unsigned long *end)
+void ftrace_module_init_start(struct module *mod)
{
+ unsigned long *start = mod->ftrace_callsites;
+ unsigned long *end = mod->ftrace_callsites + mod->num_ftrace_callsites;
+
+ mutex_lock(&ftrace_lock);
+
if (ftrace_disabled || start == end)
return;
- ftrace_process_locs(mod, start, end);
+
+ ftrace_process_locs_nolock(mod, start, end);
}

-void ftrace_module_init(struct module *mod)
+void ftrace_module_init_end(struct module *mod)
{
- ftrace_init_module(mod, mod->ftrace_callsites,
- mod->ftrace_callsites +
- mod->num_ftrace_callsites);
+ mutex_unlock(&ftrace_lock);
}

static int ftrace_module_notify_exit(struct notifier_block *self,
--
1.9.1


2015-12-14 03:16:23

by Qiu, PeiyangX

[permalink] [raw]
Subject: [PATCH] ftrace: fix race between ftrace and insmod

We hit ftrace_bug report when booting Android on a 64bit ATOM SOC chip.
Basically, there is a race between insmod and ftrace_run_update_code.

After load_module=>ftrace_module_init, another thread jumps in to call
ftrace_run_update_code=>ftrace_arch_code_modify_prepare
=>set_all_modules_text_rw, to change all modules
as RW. Since the new module is at MODULE_STATE_UNFORMED, the text attribute
is not changed. Then, the 2nd thread goes ahead to change codes.
However, load_module continues to call
complete_formation=>set_section_ro_nx,
then 2nd thread would fail when probing the module's TEXT.

Below patch tries to resolve it.

commit a949ae560a511fe4e3adf48fa44fefded93e5c2b
Author: Steven Rostedt (Red Hat) <[email protected]>
Date: Thu Apr 24 10:40:12 2014 -0400

ftrace/module: Hardcode ftrace_module_init() call into load_module()

But it can't fully resolve the issue.

THis patch holds ftrace_mutex across ftrace_module_init and
complete_formation.

Signed-off-by: Qiu Peiyang <[email protected]>
Signed-off-by: Zhang Yanmin <[email protected]>
---
include/linux/ftrace.h | 6 ++++--
kernel/module.c | 3 ++-
kernel/trace/ftrace.c | 33 ++++++++++++++++++++++-----------
3 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index eae6548..4adc279 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -585,7 +585,8 @@ static inline int ftrace_modify_call(struct
dyn_ftrace *rec, unsigned long old_a
extern int ftrace_arch_read_dyn_info(char *buf, int size);

extern int skip_trace(unsigned long ip);
-extern void ftrace_module_init(struct module *mod);
+extern void ftrace_module_init_start(struct module *mod);
+extern void ftrace_module_init_end(struct module *mod);

extern void ftrace_disable_daemon(void);
extern void ftrace_enable_daemon(void);
@@ -595,7 +596,8 @@ static inline int ftrace_force_update(void) { return
0; }
static inline void ftrace_disable_daemon(void) { }
static inline void ftrace_enable_daemon(void) { }
static inline void ftrace_release_mod(struct module *mod) {}
-static inline void ftrace_module_init(struct module *mod) {}
+static inline void ftrace_module_init_start(struct module *mod) {}
+static inline void ftrace_module_init_end(struct module *mod) {}
static inline __init int register_ftrace_command(struct
ftrace_func_command *cmd)
{
return -EINVAL;
diff --git a/kernel/module.c b/kernel/module.c
index 8f051a1..324c5c6 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3506,10 +3506,11 @@ static int load_module(struct load_info *info,
const char __user *uargs,
dynamic_debug_setup(info->debug, info->num_debug);

/* Ftrace init must be called in the MODULE_STATE_UNFORMED state */
- ftrace_module_init(mod);
+ ftrace_module_init_start(mod);

/* Finally it's fully formed, ready to start executing. */
err = complete_formation(mod, info);
+ ftrace_module_init_end(mod);
if (err)
goto ddebug_cleanup;

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 3f743b1..436c199 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4795,7 +4795,7 @@ static int ftrace_cmp_ips(const void *a, const
void *b)
return 0;
}

-static int ftrace_process_locs(struct module *mod,
+static int ftrace_process_locs_nolock(struct module *mod,
unsigned long *start,
unsigned long *end)
{
@@ -4820,8 +4820,6 @@ static int ftrace_process_locs(struct module *mod,
if (!start_pg)
return -ENOMEM;

- mutex_lock(&ftrace_lock);
-
/*
* Core and each module needs their own pages, as
* modules will free them when they are removed.
@@ -4889,8 +4887,18 @@ static int ftrace_process_locs(struct module *mod,
local_irq_restore(flags);
ret = 0;
out:
- mutex_unlock(&ftrace_lock);
+ return ret;
+}

+static int ftrace_process_locs(struct module *mod,
+ unsigned long *start,
+ unsigned long *end)
+{
+ int ret;
+
+ mutex_lock(&ftrace_lock);
+ ret = ftrace_process_locs_nolock(mod, start, end);
+ mutex_unlock(&ftrace_lock);
return ret;
}

@@ -4940,19 +4948,22 @@ void ftrace_release_mod(struct module *mod)
mutex_unlock(&ftrace_lock);
}

-static void ftrace_init_module(struct module *mod,
- unsigned long *start, unsigned long *end)
+void ftrace_module_init_start(struct module *mod)
{
+ unsigned long *start = mod->ftrace_callsites;
+ unsigned long *end = mod->ftrace_callsites + mod->num_ftrace_callsites;
+
+ mutex_lock(&ftrace_lock);
+
if (ftrace_disabled || start == end)
return;
- ftrace_process_locs(mod, start, end);
+
+ ftrace_process_locs_nolock(mod, start, end);
}

-void ftrace_module_init(struct module *mod)
+void ftrace_module_init_end(struct module *mod)
{
- ftrace_init_module(mod, mod->ftrace_callsites,
- mod->ftrace_callsites +
- mod->num_ftrace_callsites);
+ mutex_unlock(&ftrace_lock);
}

static int ftrace_module_notify_exit(struct notifier_block *self,
--
1.9.1

2015-12-14 15:51:33

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] ftrace: fix race between ftrace and insmod

On Mon, 14 Dec 2015 11:16:18 +0800
"Qiu, PeiyangX" <[email protected]> wrote:

> We hit ftrace_bug report when booting Android on a 64bit ATOM SOC chip.
> Basically, there is a race between insmod and ftrace_run_update_code.
>
> After load_module=>ftrace_module_init, another thread jumps in to call
> ftrace_run_update_code=>ftrace_arch_code_modify_prepare
> =>set_all_modules_text_rw, to change all modules
> as RW. Since the new module is at MODULE_STATE_UNFORMED, the text attribute
> is not changed. Then, the 2nd thread goes ahead to change codes.
> However, load_module continues to call
> complete_formation=>set_section_ro_nx,
> then 2nd thread would fail when probing the module's TEXT.
>
> Below patch tries to resolve it.
>
> commit a949ae560a511fe4e3adf48fa44fefded93e5c2b
> Author: Steven Rostedt (Red Hat) <[email protected]>
> Date: Thu Apr 24 10:40:12 2014 -0400
>
> ftrace/module: Hardcode ftrace_module_init() call into load_module()
>
> But it can't fully resolve the issue.
>
> THis patch holds ftrace_mutex across ftrace_module_init and
> complete_formation.
>
> Signed-off-by: Qiu Peiyang <[email protected]>
> Signed-off-by: Zhang Yanmin <[email protected]>

First, this patch has major whitespace damage. All tabs are now spaces!

> ---
> include/linux/ftrace.h | 6 ++++--
> kernel/module.c | 3 ++-
> kernel/trace/ftrace.c | 33 ++++++++++++++++++++++-----------
> 3 files changed, 28 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index eae6548..4adc279 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -585,7 +585,8 @@ static inline int ftrace_modify_call(struct
> dyn_ftrace *rec, unsigned long old_a
> extern int ftrace_arch_read_dyn_info(char *buf, int size);
>
> extern int skip_trace(unsigned long ip);
> -extern void ftrace_module_init(struct module *mod);
> +extern void ftrace_module_init_start(struct module *mod);
> +extern void ftrace_module_init_end(struct module *mod);
>
> extern void ftrace_disable_daemon(void);
> extern void ftrace_enable_daemon(void);
> @@ -595,7 +596,8 @@ static inline int ftrace_force_update(void) { return
> 0; }
> static inline void ftrace_disable_daemon(void) { }
> static inline void ftrace_enable_daemon(void) { }
> static inline void ftrace_release_mod(struct module *mod) {}
> -static inline void ftrace_module_init(struct module *mod) {}
> +static inline void ftrace_module_init_start(struct module *mod) {}
> +static inline void ftrace_module_init_end(struct module *mod) {}
> static inline __init int register_ftrace_command(struct
> ftrace_func_command *cmd)
> {
> return -EINVAL;
> diff --git a/kernel/module.c b/kernel/module.c
> index 8f051a1..324c5c6 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3506,10 +3506,11 @@ static int load_module(struct load_info *info,
> const char __user *uargs,
> dynamic_debug_setup(info->debug, info->num_debug);
>
> /* Ftrace init must be called in the MODULE_STATE_UNFORMED state */
> - ftrace_module_init(mod);
> + ftrace_module_init_start(mod);
>
> /* Finally it's fully formed, ready to start executing. */
> err = complete_formation(mod, info);
> + ftrace_module_init_end(mod);

Instead of modifying the module code even more, why not just put this
into the module notifier when done?

Then this would only need to change the ftrace code. Although, holding
a lock throughout seems rather hacky.



> if (err)
> goto ddebug_cleanup;
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 3f743b1..436c199 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -4795,7 +4795,7 @@ static int ftrace_cmp_ips(const void *a, const
> void *b)
> return 0;
> }
>
> -static int ftrace_process_locs(struct module *mod,
> +static int ftrace_process_locs_nolock(struct module *mod,

Note, the usually method for functions in the tracing code that do not
take locks is to simply prepend a double underscore (__) in front of
the function name: __ftrace_process_locs()


> unsigned long *start,
> unsigned long *end)
> {
> @@ -4820,8 +4820,6 @@ static int ftrace_process_locs(struct module *mod,
> if (!start_pg)
> return -ENOMEM;
>
> - mutex_lock(&ftrace_lock);
> -
> /*
> * Core and each module needs their own pages, as
> * modules will free them when they are removed.
> @@ -4889,8 +4887,18 @@ static int ftrace_process_locs(struct module *mod,
> local_irq_restore(flags);
> ret = 0;
> out:
> - mutex_unlock(&ftrace_lock);
> + return ret;
> +}

We can remove the out, and all the "goto out" can be changed to return.

>
> +static int ftrace_process_locs(struct module *mod,
> + unsigned long *start,
> + unsigned long *end)
> +{
> + int ret;
> +
> + mutex_lock(&ftrace_lock);
> + ret = ftrace_process_locs_nolock(mod, start, end);
> + mutex_unlock(&ftrace_lock);
> return ret;
> }
>
> @@ -4940,19 +4948,22 @@ void ftrace_release_mod(struct module *mod)
> mutex_unlock(&ftrace_lock);
> }
>
> -static void ftrace_init_module(struct module *mod,
> - unsigned long *start, unsigned long *end)
> +void ftrace_module_init_start(struct module *mod)
> {
> + unsigned long *start = mod->ftrace_callsites;
> + unsigned long *end = mod->ftrace_callsites + mod->num_ftrace_callsites;
> +
> + mutex_lock(&ftrace_lock);
> +
> if (ftrace_disabled || start == end)
> return;
> - ftrace_process_locs(mod, start, end);
> +
> + ftrace_process_locs_nolock(mod, start, end);
> }
>
> -void ftrace_module_init(struct module *mod)
> +void ftrace_module_init_end(struct module *mod)
> {
> - ftrace_init_module(mod, mod->ftrace_callsites,
> - mod->ftrace_callsites +
> - mod->num_ftrace_callsites);
> + mutex_unlock(&ftrace_lock);
> }
>
> static int ftrace_module_notify_exit(struct notifier_block *self,

This seems very hackish, although I can't think of a better way at the
moment. But I would like not to add more code into module.c if
possible, and just use a notifier unless there's a real reason we can't
(as there was when we added the hook in the first place).

-- Steve

2015-12-15 01:05:36

by Yanmin Zhang

[permalink] [raw]
Subject: Re: [PATCH] ftrace: fix race between ftrace and insmod

On 2015/12/14 23:51, Steven Rostedt wrote:
> On Mon, 14 Dec 2015 11:16:18 +0800
> "Qiu, PeiyangX" <[email protected]> wrote:
>
>> We hit ftrace_bug report when booting Android on a 64bit ATOM SOC chip.
>> Basically, there is a race between insmod and ftrace_run_update_code.
>>
>> After load_module=>ftrace_module_init, another thread jumps in to call
>> ftrace_run_update_code=>ftrace_arch_code_modify_prepare
>> =>set_all_modules_text_rw, to change all modules
>> as RW. Since the new module is at MODULE_STATE_UNFORMED, the text attribute
>> is not changed. Then, the 2nd thread goes ahead to change codes.
>> However, load_module continues to call
>> complete_formation=>set_section_ro_nx,
>> then 2nd thread would fail when probing the module's TEXT.
>>
>> Below patch tries to resolve it.
>>
>> commit a949ae560a511fe4e3adf48fa44fefded93e5c2b
>> Author: Steven Rostedt (Red Hat) <[email protected]>
>> Date: Thu Apr 24 10:40:12 2014 -0400
>>
>> ftrace/module: Hardcode ftrace_module_init() call into load_module()
>>
>> But it can't fully resolve the issue.
>>
>> THis patch holds ftrace_mutex across ftrace_module_init and
>> complete_formation.
>>
>> Signed-off-by: Qiu Peiyang <[email protected]>
>> Signed-off-by: Zhang Yanmin <[email protected]>
> First, this patch has major whitespace damage. All tabs are now spaces!

Sorry. That's really bad. It's the 1st time for Peiyang to send patch to LKML.
He would fix such issue by reconfiguring email client.

>
>> ---
>> include/linux/ftrace.h | 6 ++++--
>> kernel/module.c | 3 ++-
>> kernel/trace/ftrace.c | 33 ++++++++++++++++++++++-----------
>> 3 files changed, 28 insertions(+), 14 deletions(-)
>>
>> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
>> index eae6548..4adc279 100644
>> --- a/include/linux/ftrace.h
>> +++ b/include/linux/ftrace.h
>> @@ -585,7 +585,8 @@ static inline int ftrace_modify_call(struct
>> dyn_ftrace *rec, unsigned long old_a
>> extern int ftrace_arch_read_dyn_info(char *buf, int size);
>>
>> extern int skip_trace(unsigned long ip);
>> -extern void ftrace_module_init(struct module *mod);
>> +extern void ftrace_module_init_start(struct module *mod);
>> +extern void ftrace_module_init_end(struct module *mod);
>>
>> extern void ftrace_disable_daemon(void);
>> extern void ftrace_enable_daemon(void);
>> @@ -595,7 +596,8 @@ static inline int ftrace_force_update(void) { return
>> 0; }
>> static inline void ftrace_disable_daemon(void) { }
>> static inline void ftrace_enable_daemon(void) { }
>> static inline void ftrace_release_mod(struct module *mod) {}
>> -static inline void ftrace_module_init(struct module *mod) {}
>> +static inline void ftrace_module_init_start(struct module *mod) {}
>> +static inline void ftrace_module_init_end(struct module *mod) {}
>> static inline __init int register_ftrace_command(struct
>> ftrace_func_command *cmd)
>> {
>> return -EINVAL;
>> diff --git a/kernel/module.c b/kernel/module.c
>> index 8f051a1..324c5c6 100644
>> --- a/kernel/module.c
>> +++ b/kernel/module.c
>> @@ -3506,10 +3506,11 @@ static int load_module(struct load_info *info,
>> const char __user *uargs,
>> dynamic_debug_setup(info->debug, info->num_debug);
>>
>> /* Ftrace init must be called in the MODULE_STATE_UNFORMED state */
>> - ftrace_module_init(mod);
>> + ftrace_module_init_start(mod);
>>
>> /* Finally it's fully formed, ready to start executing. */
>> err = complete_formation(mod, info);
>> + ftrace_module_init_end(mod);
> Instead of modifying the module code even more, why not just put this
> into the module notifier when done?
>
> Then this would only need to change the ftrace code. Although, holding
> a lock throughout seems rather hacky.

We did consider notifier. When ftrace_module_init is called, start_pg of the
module is linked to ftrace_pages. To fix the issue, we have to either add a
new var to struct module to save its start_pg, or change ftrace to save
module start_pg before MODULE_STATE_COMING notification is delivered.
In addition, function complete_formation need consider cleanup the
start_pg if init fails.

>
>
>
>> if (err)
>> goto ddebug_cleanup;
>>
>> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
>> index 3f743b1..436c199 100644
>> --- a/kernel/trace/ftrace.c
>> +++ b/kernel/trace/ftrace.c
>> @@ -4795,7 +4795,7 @@ static int ftrace_cmp_ips(const void *a, const
>> void *b)
>> return 0;
>> }
>>
>> -static int ftrace_process_locs(struct module *mod,
>> +static int ftrace_process_locs_nolock(struct module *mod,
> Note, the usually method for functions in the tracing code that do not
> take locks is to simply prepend a double underscore (__) in front of
> the function name: __ftrace_process_locs()

Yes. We would change the function name.

>
>
>> unsigned long *start,
>> unsigned long *end)
>> {
>> @@ -4820,8 +4820,6 @@ static int ftrace_process_locs(struct module *mod,
>> if (!start_pg)
>> return -ENOMEM;
>>
>> - mutex_lock(&ftrace_lock);
>> -
>> /*
>> * Core and each module needs their own pages, as
>> * modules will free them when they are removed.
>> @@ -4889,8 +4887,18 @@ static int ftrace_process_locs(struct module *mod,
>> local_irq_restore(flags);
>> ret = 0;
>> out:
>> - mutex_unlock(&ftrace_lock);
>> + return ret;
>> +}
> We can remove the out, and all the "goto out" can be changed to return.

Yes, indeed. We would add such cleanup codes.

>
>> +static int ftrace_process_locs(struct module *mod,
>> + unsigned long *start,
>> + unsigned long *end)
>> +{
>> + int ret;
>> +
>> + mutex_lock(&ftrace_lock);
>> + ret = ftrace_process_locs_nolock(mod, start, end);
>> + mutex_unlock(&ftrace_lock);
>> return ret;
>> }
>>
>> @@ -4940,19 +4948,22 @@ void ftrace_release_mod(struct module *mod)
>> mutex_unlock(&ftrace_lock);
>> }
>>
>> -static void ftrace_init_module(struct module *mod,
>> - unsigned long *start, unsigned long *end)
>> +void ftrace_module_init_start(struct module *mod)
>> {
>> + unsigned long *start = mod->ftrace_callsites;
>> + unsigned long *end = mod->ftrace_callsites + mod->num_ftrace_callsites;
>> +
>> + mutex_lock(&ftrace_lock);
>> +
>> if (ftrace_disabled || start == end)
>> return;
>> - ftrace_process_locs(mod, start, end);
>> +
>> + ftrace_process_locs_nolock(mod, start, end);
>> }
>>
>> -void ftrace_module_init(struct module *mod)
>> +void ftrace_module_init_end(struct module *mod)
>> {
>> - ftrace_init_module(mod, mod->ftrace_callsites,
>> - mod->ftrace_callsites +
>> - mod->num_ftrace_callsites);
>> + mutex_unlock(&ftrace_lock);
>> }
>>
>> static int ftrace_module_notify_exit(struct notifier_block *self,
> This seems very hackish, although I can't think of a better way at the
> moment. But I would like not to add more code into module.c if
> possible, and just use a notifier unless there's a real reason we can't
> (as there was when we added the hook in the first place).

We would double-check it again. It's not a good idea to change common
module codes.

Thanks for your kind comments.

Yanmin

2015-12-15 03:26:44

by Yanmin Zhang

[permalink] [raw]
Subject: Re: [PATCH] ftrace: fix race between ftrace and insmod

On 2015/12/15 9:05, Zhang, Yanmin wrote:
> On 2015/12/14 23:51, Steven Rostedt wrote:
>> On Mon, 14 Dec 2015 11:16:18 +0800
>> "Qiu, PeiyangX" <[email protected]> wrote:
>>
>>> We hit ftrace_bug report when booting Android on a 64bit ATOM SOC chip.
>>> Basically, there is a race between insmod and ftrace_run_update_code.
>>>
>>> After load_module=>ftrace_module_init, another thread jumps in to call
>>> ftrace_run_update_code=>ftrace_arch_code_modify_prepare
>>> =>set_all_modules_text_rw, to change all modules
>>> as RW. Since the new module is at MODULE_STATE_UNFORMED, the text attribute
>>> is not changed. Then, the 2nd thread goes ahead to change codes.
>>> However, load_module continues to call
>>> complete_formation=>set_section_ro_nx,
>>> then 2nd thread would fail when probing the module's TEXT.
>>>
>>> Below patch tries to resolve it.
>>>
>>> commit a949ae560a511fe4e3adf48fa44fefded93e5c2b
>>> Author: Steven Rostedt (Red Hat) <[email protected]>
>>> Date: Thu Apr 24 10:40:12 2014 -0400
>>>
>>> ftrace/module: Hardcode ftrace_module_init() call into load_module()
>>>
>>> But it can't fully resolve the issue.
>>>
>>> THis patch holds ftrace_mutex across ftrace_module_init and
>>> complete_formation.
>>>
>>> Signed-off-by: Qiu Peiyang <[email protected]>
>>> Signed-off-by: Zhang Yanmin <[email protected]>
>> First, this patch has major whitespace damage. All tabs are now spaces!
> This seems very hackish, although I can't think of a better way at the
> moment. But I would like not to add more code into module.c if
> possible, and just use a notifier unless there's a real reason we can't
> (as there was when we added the hook in the first place).
> We would double-check it again. It's not a good idea to change common
> module codes.

I double-checked it. If using notifier, we have to add 2 new module notification
events in enum module_state.
For example, MODULE_STATE_INITING, MODULE_STATE_INITING_FAILED.
At MODULE_STATE_INITING, call ftrace_module_init(mod);
At MODULE_STATE_INITING_FAILED, call ftrace_release_mod (or similar function);
At MODULE_STATE_COMING, call a new function which links the new start_pg
to ftrace_pages.
Such design can also fix another bug in current kernel that ftrace start_pg
of the module is not cleaned up if complete_formation fails.
However, we change module common codes. The new events only serve ftrace.

If not holding ftrace_mutex across start_pg setup and complete_formation,
some other variables are not protected well, such like ftrace_pages,
ftrace_start_up, and so on.

Yanmin

2015-12-15 17:37:27

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] ftrace: fix race between ftrace and insmod

On Tue, 15 Dec 2015 11:26:41 +0800
"Zhang, Yanmin" <[email protected]> wrote:

> > This seems very hackish, although I can't think of a better way at the
> > moment. But I would like not to add more code into module.c if
> > possible, and just use a notifier unless there's a real reason we can't
> > (as there was when we added the hook in the first place).
> > We would double-check it again. It's not a good idea to change common
> > module codes.
>
> I double-checked it. If using notifier, we have to add 2 new module notification
> events in enum module_state.
> For example, MODULE_STATE_INITING, MODULE_STATE_INITING_FAILED.
> At MODULE_STATE_INITING, call ftrace_module_init(mod);
> At MODULE_STATE_INITING_FAILED, call ftrace_release_mod (or similar function);
> At MODULE_STATE_COMING, call a new function which links the new start_pg
> to ftrace_pages.
> Such design can also fix another bug in current kernel that ftrace start_pg
> of the module is not cleaned up if complete_formation fails.
> However, we change module common codes. The new events only serve ftrace.
>
> If not holding ftrace_mutex across start_pg setup and complete_formation,
> some other variables are not protected well, such like ftrace_pages,
> ftrace_start_up, and so on.

OK, that was basically the reason the hook was added in the first place.

The reason the modules notifier doesn't work here is that there's too
many exits from the module code that may leave the mutex held.

I thought about this some more. So the issue is that we add the module
functions to the ftrace records. Then another task enables function
tracing before the module is fully set up. As ftrace is enabling
function tracing, the module sets its text to read-only, outside the
scope of ftrace setting all text to read-write. When ftrace gets to the
module code, it fails to do the modifications and you get the
ftrace_bug() splat.

Sounds right?

Now, I really dislike the taking of the ftrace mutex and returning back
to module handling. That is very subtle, fragile and prone to bugs. I
took a step back to find another solution and I think I found one.

Take the below patch and add to it (I'll keep this patch as mine, and
you can submit another patch to do the following). You don't need to
resend this patch, just send me a patch that does this:


Add the module notifier that calls ftrace_module_enable(mod), removing
it from ftrace_init_module(). Feel free to monkey with that function to
be able to be called directly if it can't already.

What my patch does is to create a new ftrace record flag DISABLED,
which is set to all functions in a module as it is added. Then later on
(in the module notifier that you will add), the flag is cleared. In
between, the module functions will be ignored.


If the module errors out and the notifier is not called, we don't care.
The records will simply be removed, and the flag will be meaningless.

I needed to move the code around to still let the module code get
enabled if tracing was already enabled. I even cleaned that code up as
I found it has gotten a bit messy.

-- Steve

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 4736a826baf5..660e7c698f3b 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -357,6 +357,7 @@ bool is_ftrace_trampoline(unsigned long addr);
* REGS - the record wants the function to save regs
* REGS_EN - the function is set up to save regs.
* IPMODIFY - the record allows for the IP address to be changed.
+ * DISABLED - the record is not ready to be touched yet
*
* When a new ftrace_ops is registered and wants a function to save
* pt_regs, the rec->flag REGS is set. When the function has been
@@ -371,10 +372,11 @@ enum {
FTRACE_FL_TRAMP = (1UL << 28),
FTRACE_FL_TRAMP_EN = (1UL << 27),
FTRACE_FL_IPMODIFY = (1UL << 26),
+ FTRACE_FL_DISABLED = (1UL << 25),
};

-#define FTRACE_REF_MAX_SHIFT 26
-#define FTRACE_FL_BITS 6
+#define FTRACE_REF_MAX_SHIFT 25
+#define FTRACE_FL_BITS 7
#define FTRACE_FL_MASKED_BITS ((1UL << FTRACE_FL_BITS) - 1)
#define FTRACE_FL_MASK (FTRACE_FL_MASKED_BITS << FTRACE_REF_MAX_SHIFT)
#define FTRACE_REF_MAX ((1UL << FTRACE_REF_MAX_SHIFT) - 1)
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index e290a30f2d0b..e7e15874fb7c 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2023,6 +2023,9 @@ static int ftrace_check_record(struct dyn_ftrace *rec, int enable, int update)

ftrace_bug_type = FTRACE_BUG_UNKNOWN;

+ if (rec->flags & FTRACE_FL_DISABLED)
+ return FTRACE_UPDATE_IGNORE;
+
/*
* If we are updating calls:
*
@@ -2849,64 +2852,41 @@ ops_references_rec(struct ftrace_ops *ops, struct dyn_ftrace *rec)
return 1;
}

-static int referenced_filters(struct dyn_ftrace *rec)
-{
- struct ftrace_ops *ops;
- int cnt = 0;
-
- for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
- if (ops_references_rec(ops, rec))
- cnt++;
- }
-
- return cnt;
-}
-
static int ftrace_update_code(struct module *mod, struct ftrace_page *new_pgs)
{
struct ftrace_page *pg;
struct dyn_ftrace *p;
cycle_t start, stop;
unsigned long update_cnt = 0;
- unsigned long ref = 0;
- bool test = false;
+ unsigned long rec_flags = 0;
int i;

+ start = ftrace_now(raw_smp_processor_id());
+
/*
- * When adding a module, we need to check if tracers are
- * currently enabled and if they are set to trace all functions.
- * If they are, we need to enable the module functions as well
- * as update the reference counts for those function records.
+ * When a module is loaded, this function is called to convert
+ * the calls to mcount in its text to nops, and also to create
+ * an entry in the ftrace data. Now, if ftrace is activated
+ * after this call, but before the module sets its text to
+ * read-only, the modification of enabling ftrace can fail if
+ * the read-only is done while ftrace is converting the calls.
+ * To prevent this, the module's records are set as disabled
+ * and will be enabled after the call to set the module's text
+ * to read-only.
*/
- if (mod) {
- struct ftrace_ops *ops;
-
- for (ops = ftrace_ops_list;
- ops != &ftrace_list_end; ops = ops->next) {
- if (ops->flags & FTRACE_OPS_FL_ENABLED) {
- if (ops_traces_mod(ops))
- ref++;
- else
- test = true;
- }
- }
- }
-
- start = ftrace_now(raw_smp_processor_id());
+ if (mod)
+ rec_flags |= FTRACE_FL_DISABLED;

for (pg = new_pgs; pg; pg = pg->next) {

for (i = 0; i < pg->index; i++) {
- int cnt = ref;

/* If something went wrong, bail without enabling anything */
if (unlikely(ftrace_disabled))
return -1;

p = &pg->records[i];
- if (test)
- cnt += referenced_filters(p);
- p->flags = cnt;
+ p->flags = rec_flags;

/*
* Do the initial record conversion from mcount jump
@@ -2916,21 +2896,6 @@ static int ftrace_update_code(struct module *mod, struct ftrace_page *new_pgs)
break;

update_cnt++;
-
- /*
- * If the tracing is enabled, go ahead and enable the record.
- *
- * The reason not to enable the record immediatelly is the
- * inherent check of ftrace_make_nop/ftrace_make_call for
- * correct previous instructions. Making first the NOP
- * conversion puts the module to the correct state, thus
- * passing the ftrace_make_call check.
- */
- if (ftrace_start_up && cnt) {
- int failed = __ftrace_replace_code(p, 1);
- if (failed)
- ftrace_bug(failed, p);
- }
}
}

@@ -4938,6 +4903,19 @@ static int ftrace_process_locs(struct module *mod,

#define next_to_ftrace_page(p) container_of(p, struct ftrace_page, next)

+static int referenced_filters(struct dyn_ftrace *rec)
+{
+ struct ftrace_ops *ops;
+ int cnt = 0;
+
+ for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
+ if (ops_references_rec(ops, rec))
+ cnt++;
+ }
+
+ return cnt;
+}
+
void ftrace_release_mod(struct module *mod)
{
struct dyn_ftrace *rec;
@@ -4980,12 +4958,82 @@ void ftrace_release_mod(struct module *mod)
mutex_unlock(&ftrace_lock);
}

+static void ftrace_module_enable(struct module *mod)
+{
+ struct dyn_ftrace *rec;
+ struct ftrace_page *pg;
+
+ mutex_lock(&ftrace_lock);
+
+ if (ftrace_disabled)
+ goto out_unlock;
+
+ /*
+ * If the tracing is enabled, go ahead and enable the record.
+ *
+ * The reason not to enable the record immediatelly is the
+ * inherent check of ftrace_make_nop/ftrace_make_call for
+ * correct previous instructions. Making first the NOP
+ * conversion puts the module to the correct state, thus
+ * passing the ftrace_make_call check.
+ *
+ * We also delay this to after the module code already set the
+ * text to read-only, as we now need to set it back to read-write
+ * so that we can modify the text.
+ */
+ if (ftrace_start_up)
+ ftrace_arch_code_modify_prepare();
+
+ do_for_each_ftrace_rec(pg, rec) {
+ int cnt;
+ /*
+ * do_for_each_ftrace_rec() is a double loop.
+ * module text shares the pg. If a record is
+ * not part of this module, then skip this pg,
+ * which the "break" will do.
+ */
+ if (!within_module_core(rec->ip, mod))
+ break;
+
+ cnt = 0;
+
+ /*
+ * When adding a module, we need to check if tracers are
+ * currently enabled and if they are, and can trace this record,
+ * we need to enable the module functions as well as update the
+ * reference counts for those function records.
+ */
+ if (ftrace_start_up)
+ cnt += referenced_filters(rec);
+
+ /* This clears FTRACE_FL_DISABLED */
+ rec->flags = cnt;
+
+ if (ftrace_start_up && cnt) {
+ int failed = __ftrace_replace_code(rec, 1);
+ if (failed) {
+ ftrace_bug(failed, rec);
+ goto out_loop;
+ }
+ }
+
+ } while_for_each_ftrace_rec();
+
+ out_loop:
+ if (ftrace_start_up)
+ ftrace_arch_code_modify_post_process();
+
+ out_unlock:
+ mutex_unlock(&ftrace_lock);
+}
+
static void ftrace_init_module(struct module *mod,
unsigned long *start, unsigned long *end)
{
if (ftrace_disabled || start == end)
return;
ftrace_process_locs(mod, start, end);
+ ftrace_module_enable(mod);
}

void ftrace_module_init(struct module *mod)

2015-12-16 00:54:13

by Yanmin Zhang

[permalink] [raw]
Subject: Re: [PATCH] ftrace: fix race between ftrace and insmod

On 2015/12/16 1:37, Steven Rostedt wrote:
> On Tue, 15 Dec 2015 11:26:41 +0800
> "Zhang, Yanmin" <[email protected]> wrote:
>
>>> This seems very hackish, although I can't think of a better way at the
>>> moment. But I would like not to add more code into module.c if
>>> possible, and just use a notifier unless there's a real reason we can't
>>> (as there was when we added the hook in the first place).
>>> We would double-check it again. It's not a good idea to change common
>>> module codes.
>> I double-checked it. If using notifier, we have to add 2 new module notification
>> events in enum module_state.
>> For example, MODULE_STATE_INITING, MODULE_STATE_INITING_FAILED.
>> At MODULE_STATE_INITING, call ftrace_module_init(mod);
>> At MODULE_STATE_INITING_FAILED, call ftrace_release_mod (or similar function);
>> At MODULE_STATE_COMING, call a new function which links the new start_pg
>> to ftrace_pages.
>> Such design can also fix another bug in current kernel that ftrace start_pg
>> of the module is not cleaned up if complete_formation fails.
>> However, we change module common codes. The new events only serve ftrace.
>>
>> If not holding ftrace_mutex across start_pg setup and complete_formation,
>> some other variables are not protected well, such like ftrace_pages,
>> ftrace_start_up, and so on.
> OK, that was basically the reason the hook was added in the first place.
>
> The reason the modules notifier doesn't work here is that there's too
> many exits from the module code that may leave the mutex held.

Yes.

>
> I thought about this some more. So the issue is that we add the module
> functions to the ftrace records. Then another task enables function
> tracing before the module is fully set up. As ftrace is enabling
> function tracing, the module sets its text to read-only, outside the
> scope of ftrace setting all text to read-write. When ftrace gets to the
> module code, it fails to do the modifications and you get the
> ftrace_bug() splat.
>
> Sounds right?

Right, indeed.
My case is worse than that as my android uses kernel 3.14.50, which has another
bug around remove_breakpoint as it calls probe_kernel_write directly. The
latest upstream has no such issue. So we just need focus on the race you
explain well.

>
> Now, I really dislike the taking of the ftrace mutex and returning back
> to module handling. That is very subtle, fragile and prone to bugs.

Yes, just like that the current codes are not perfect. :)

> I
> took a step back to find another solution and I think I found one.
>
> Take the below patch and add to it (I'll keep this patch as mine, and
> you can submit another patch to do the following). You don't need to
> resend this patch, just send me a patch that does this:

Ok.

>
>
> Add the module notifier that calls ftrace_module_enable(mod), removing
> it from ftrace_init_module(). Feel free to monkey with that function to
> be able to be called directly if it can't already.
>
> What my patch does is to create a new ftrace record flag DISABLED,
> which is set to all functions in a module as it is added. Then later on
> (in the module notifier that you will add), the flag is cleared. In
> between, the module functions will be ignored.

It's a good idea.


>
> If the module errors out and the notifier is not called, we don't care.
> The records will simply be removed, and the flag will be meaningless.

One question: if complete_formation fails, load_module
wouldn't send any notification and ftrace_release_mod is not called.
How can ftrace core remove all pg->records of the module?

If complete_formation succeeds, further calls cause module errors out,
load_module would send MODULE_STATE_GOING, ftrace_release_mod is called.

Thanks for your quick response.

Yanmin

2015-12-16 10:28:40

by Yanmin Zhang

[permalink] [raw]
Subject: Re: [PATCH] ftrace: fix race between ftrace and insmod


On 2015/12/16 1:37, Steven Rostedt wrote:
> On Tue, 15 Dec 2015 11:26:41 +0800
> "Zhang, Yanmin" <[email protected]> wrote:
>
>>> This seems very hackish, although I can't think of a better way at the
>>> moment. But I would like not to add more code into module.c if
>>> possible, and just use a notifier unless there's a real reason we can't
>>> (as there was when we added the hook in the first place).
>>> We would double-check it again. It's not a good idea to change common
>>> module codes.
>> I double-checked it. If using notifier, we have to add 2 new module notification
>> events in enum module_state.
>> For example, MODULE_STATE_INITING, MODULE_STATE_INITING_FAILED.
>> At MODULE_STATE_INITING, call ftrace_module_init(mod);
>> At MODULE_STATE_INITING_FAILED, call ftrace_release_mod (or similar function);
>> At MODULE_STATE_COMING, call a new function which links the new start_pg
>> to ftrace_pages.
>> Such design can also fix another bug in current kernel that ftrace start_pg
>> of the module is not cleaned up if complete_formation fails.
>> However, we change module common codes. The new events only serve ftrace.
>>
>> If not holding ftrace_mutex across start_pg setup and complete_formation,
>> some other variables are not protected well, such like ftrace_pages,
>> ftrace_start_up, and so on.
> OK, that was basically the reason the hook was added in the first place.
>
> The reason the modules notifier doesn't work here is that there's too
> many exits from the module code that may leave the mutex held.
>
> I thought about this some more. So the issue is that we add the module
> functions to the ftrace records. Then another task enables function
> tracing before the module is fully set up. As ftrace is enabling
> function tracing, the module sets its text to read-only, outside the
> scope of ftrace setting all text to read-write. When ftrace gets to the
> module code, it fails to do the modifications and you get the
> ftrace_bug() splat.
>
> Sounds right?
>
> Now, I really dislike the taking of the ftrace mutex and returning back
> to module handling. That is very subtle, fragile and prone to bugs. I
> took a step back to find another solution and I think I found one.
>
> Take the below patch and add to it (I'll keep this patch as mine, and
> you can submit another patch to do the following). You don't need to
> resend this patch, just send me a patch that does this:
>
>
> Add the module notifier that calls ftrace_module_enable(mod), removing
> it from ftrace_init_module(). Feel free to monkey with that function to
> be able to be called directly if it can't already.
>
> What my patch does is to create a new ftrace record flag DISABLED,
> which is set to all functions in a module as it is added. Then later on
> (in the module notifier that you will add), the flag is cleared. In
> between, the module functions will be ignored.
>
>
> If the module errors out and the notifier is not called, we don't care.
> The records will simply be removed, and the flag will be meaningless.
>
> I needed to move the code around to still let the module code get
> enabled if tracing was already enabled. I even cleaned that code up as
> I found it has gotten a bit messy.
>
> -- Steve
>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 4736a826baf5..660e7c698f3b 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -357,6 +357,7 @@ bool is_ftrace_trampoline(unsigned long addr);
> * REGS - the record wants the function to save regs
> * REGS_EN - the function is set up to save regs.
> * IPMODIFY - the record allows for the IP address to be changed.
> + * DISABLED - the record is not ready to be touched yet
> *
> * When a new ftrace_ops is registered and wants a function to save
> * pt_regs, the rec->flag REGS is set. When the function has been
> @@ -371,10 +372,11 @@ enum {
> FTRACE_FL_TRAMP = (1UL << 28),
> FTRACE_FL_TRAMP_EN = (1UL << 27),
> FTRACE_FL_IPMODIFY = (1UL << 26),
> + FTRACE_FL_DISABLED = (1UL << 25),
> };
>
> -#define FTRACE_REF_MAX_SHIFT 26
> -#define FTRACE_FL_BITS 6
> +#define FTRACE_REF_MAX_SHIFT 25
> +#define FTRACE_FL_BITS 7
> #define FTRACE_FL_MASKED_BITS ((1UL << FTRACE_FL_BITS) - 1)
> #define FTRACE_FL_MASK (FTRACE_FL_MASKED_BITS << FTRACE_REF_MAX_SHIFT)
> #define FTRACE_REF_MAX ((1UL << FTRACE_REF_MAX_SHIFT) - 1)
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index e290a30f2d0b..e7e15874fb7c 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2023,6 +2023,9 @@ static int ftrace_check_record(struct dyn_ftrace *rec, int enable, int update)
>
> ftrace_bug_type = FTRACE_BUG_UNKNOWN;
>
> + if (rec->flags & FTRACE_FL_DISABLED)
> + return FTRACE_UPDATE_IGNORE;
> +
> /*
> * If we are updating calls:
> *
> @@ -2849,64 +2852,41 @@ ops_references_rec(struct ftrace_ops *ops, struct dyn_ftrace *rec)
> return 1;
> }
>
> -static int referenced_filters(struct dyn_ftrace *rec)
> -{
> - struct ftrace_ops *ops;
> - int cnt = 0;
> -
> - for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
> - if (ops_references_rec(ops, rec))
> - cnt++;
> - }
> -
> - return cnt;
> -}
> -
> static int ftrace_update_code(struct module *mod, struct ftrace_page *new_pgs)
> {
> struct ftrace_page *pg;
> struct dyn_ftrace *p;
> cycle_t start, stop;
> unsigned long update_cnt = 0;
> - unsigned long ref = 0;
> - bool test = false;
> + unsigned long rec_flags = 0;
> int i;
>
> + start = ftrace_now(raw_smp_processor_id());
> +
> /*
> - * When adding a module, we need to check if tracers are
> - * currently enabled and if they are set to trace all functions.
> - * If they are, we need to enable the module functions as well
> - * as update the reference counts for those function records.
> + * When a module is loaded, this function is called to convert
> + * the calls to mcount in its text to nops, and also to create
> + * an entry in the ftrace data. Now, if ftrace is activated
> + * after this call, but before the module sets its text to
> + * read-only, the modification of enabling ftrace can fail if
> + * the read-only is done while ftrace is converting the calls.
> + * To prevent this, the module's records are set as disabled
> + * and will be enabled after the call to set the module's text
> + * to read-only.
> */
> - if (mod) {
> - struct ftrace_ops *ops;
> -
> - for (ops = ftrace_ops_list;
> - ops != &ftrace_list_end; ops = ops->next) {
> - if (ops->flags & FTRACE_OPS_FL_ENABLED) {
> - if (ops_traces_mod(ops))
> - ref++;
> - else
> - test = true;
> - }
> - }
> - }
> -
> - start = ftrace_now(raw_smp_processor_id());
> + if (mod)
> + rec_flags |= FTRACE_FL_DISABLED;
>
> for (pg = new_pgs; pg; pg = pg->next) {
>
> for (i = 0; i < pg->index; i++) {
> - int cnt = ref;
>
> /* If something went wrong, bail without enabling anything */
> if (unlikely(ftrace_disabled))
> return -1;
>
> p = &pg->records[i];
> - if (test)
> - cnt += referenced_filters(p);
> - p->flags = cnt;
> + p->flags = rec_flags;
>
> /*
> * Do the initial record conversion from mcount jump
> @@ -2916,21 +2896,6 @@ static int ftrace_update_code(struct module *mod, struct ftrace_page *new_pgs)
> break;
>
> update_cnt++;
> -
> - /*
> - * If the tracing is enabled, go ahead and enable the record.
> - *
> - * The reason not to enable the record immediatelly is the
> - * inherent check of ftrace_make_nop/ftrace_make_call for
> - * correct previous instructions. Making first the NOP
> - * conversion puts the module to the correct state, thus
> - * passing the ftrace_make_call check.
> - */
> - if (ftrace_start_up && cnt) {
> - int failed = __ftrace_replace_code(p, 1);
> - if (failed)
> - ftrace_bug(failed, p);
> - }
> }
> }
>
> @@ -4938,6 +4903,19 @@ static int ftrace_process_locs(struct module *mod,
>
> #define next_to_ftrace_page(p) container_of(p, struct ftrace_page, next)
>
> +static int referenced_filters(struct dyn_ftrace *rec)
> +{
> + struct ftrace_ops *ops;
> + int cnt = 0;
> +
> + for (ops = ftrace_ops_list; ops != &ftrace_list_end; ops = ops->next) {
> + if (ops_references_rec(ops, rec))
> + cnt++;
> + }
> +
> + return cnt;
> +}
> +
> void ftrace_release_mod(struct module *mod)
> {
> struct dyn_ftrace *rec;
> @@ -4980,12 +4958,82 @@ void ftrace_release_mod(struct module *mod)
> mutex_unlock(&ftrace_lock);
> }
>
> +static void ftrace_module_enable(struct module *mod)
> +{
> + struct dyn_ftrace *rec;
> + struct ftrace_page *pg;
> +
> + mutex_lock(&ftrace_lock);
> +
> + if (ftrace_disabled)
> + goto out_unlock;
> +
> + /*
> + * If the tracing is enabled, go ahead and enable the record.
> + *
> + * The reason not to enable the record immediatelly is the
> + * inherent check of ftrace_make_nop/ftrace_make_call for
> + * correct previous instructions. Making first the NOP
> + * conversion puts the module to the correct state, thus
> + * passing the ftrace_make_call check.
> + *
> + * We also delay this to after the module code already set the
> + * text to read-only, as we now need to set it back to read-write
> + * so that we can modify the text.
> + */
> + if (ftrace_start_up)
> + ftrace_arch_code_modify_prepare();
> +
> + do_for_each_ftrace_rec(pg, rec) {
> + int cnt;
> + /*
> + * do_for_each_ftrace_rec() is a double loop.
> + * module text shares the pg. If a record is
> + * not part of this module, then skip this pg,
> + * which the "break" will do.
> + */
> + if (!within_module_core(rec->ip, mod))
> + break;
> +
> + cnt = 0;
> +
> + /*
> + * When adding a module, we need to check if tracers are
> + * currently enabled and if they are, and can trace this record,
> + * we need to enable the module functions as well as update the
> + * reference counts for those function records.
> + */
> + if (ftrace_start_up)
> + cnt += referenced_filters(rec);
> +
> + /* This clears FTRACE_FL_DISABLED */
> + rec->flags = cnt;
> +
> + if (ftrace_start_up && cnt) {
> + int failed = __ftrace_replace_code(rec, 1);

If we choose to call ftrace_module_enable when receiving module notification
MODULE_STATE_COMING, TEXT section of the module is already changed to RO.

2015-12-16 14:28:25

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] ftrace: fix race between ftrace and insmod

On Wed, 16 Dec 2015 18:28:35 +0800
"Zhang, Yanmin" <[email protected]> wrote:

> > + /*
> > + * If the tracing is enabled, go ahead and enable the record.
> > + *
> > + * The reason not to enable the record immediatelly is the
> > + * inherent check of ftrace_make_nop/ftrace_make_call for
> > + * correct previous instructions. Making first the NOP
> > + * conversion puts the module to the correct state, thus
> > + * passing the ftrace_make_call check.
> > + *
> > + * We also delay this to after the module code already set the
> > + * text to read-only, as we now need to set it back to read-write
> > + * so that we can modify the text.
> > + */
> > + if (ftrace_start_up)
> > + ftrace_arch_code_modify_prepare();
> > +
> > + do_for_each_ftrace_rec(pg, rec) {
> > + int cnt;
> > + /*
> > + * do_for_each_ftrace_rec() is a double loop.
> > + * module text shares the pg. If a record is
> > + * not part of this module, then skip this pg,
> > + * which the "break" will do.
> > + */
> > + if (!within_module_core(rec->ip, mod))
> > + break;
> > +
> > + cnt = 0;
> > +
> > + /*
> > + * When adding a module, we need to check if tracers are
> > + * currently enabled and if they are, and can trace this record,
> > + * we need to enable the module functions as well as update the
> > + * reference counts for those function records.
> > + */
> > + if (ftrace_start_up)
> > + cnt += referenced_filters(rec);
> > +
> > + /* This clears FTRACE_FL_DISABLED */
> > + rec->flags = cnt;
> > +
> > + if (ftrace_start_up && cnt) {
> > + int failed = __ftrace_replace_code(rec, 1);
>
> If we choose to call ftrace_module_enable when receiving module notification
> MODULE_STATE_COMING, TEXT section of the module is already changed to RO.

And that's why we call ftrace_arch_code_modify_prepare(). That should
change all text to RW.

-- Steve

2015-12-17 06:49:12

by Yanmin Zhang

[permalink] [raw]
Subject: Re: [PATCH] ftrace: fix race between ftrace and insmod

On 2015/12/16 22:28, Steven Rostedt wrote:
> On Wed, 16 Dec 2015 18:28:35 +0800
> "Zhang, Yanmin" <[email protected]> wrote:
>
>>> + /*
>>> + * If the tracing is enabled, go ahead and enable the record.
>>> + *
>>> + * The reason not to enable the record immediatelly is the
>>> + * inherent check of ftrace_make_nop/ftrace_make_call for
>>> + * correct previous instructions. Making first the NOP
>>> + * conversion puts the module to the correct state, thus
>>> + * passing the ftrace_make_call check.
>>> + *
>>> + * We also delay this to after the module code already set the
>>> + * text to read-only, as we now need to set it back to read-write
>>> + * so that we can modify the text.
>>> + */
>>> + if (ftrace_start_up)
>>> + ftrace_arch_code_modify_prepare();
>>> +
>>> + do_for_each_ftrace_rec(pg, rec) {
>>> + int cnt;
>>> + /*
>>> + * do_for_each_ftrace_rec() is a double loop.
>>> + * module text shares the pg. If a record is
>>> + * not part of this module, then skip this pg,
>>> + * which the "break" will do.
>>> + */
>>> + if (!within_module_core(rec->ip, mod))
>>> + break;
>>> +
>>> + cnt = 0;
>>> +
>>> + /*
>>> + * When adding a module, we need to check if tracers are
>>> + * currently enabled and if they are, and can trace this record,
>>> + * we need to enable the module functions as well as update the
>>> + * reference counts for those function records.
>>> + */
>>> + if (ftrace_start_up)
>>> + cnt += referenced_filters(rec);
>>> +
>>> + /* This clears FTRACE_FL_DISABLED */
>>> + rec->flags = cnt;
>>> +
>>> + if (ftrace_start_up && cnt) {
>>> + int failed = __ftrace_replace_code(rec, 1);
>> If we choose to call ftrace_module_enable when receiving module notification
>> MODULE_STATE_COMING, TEXT section of the module is already changed to RO.
> And that's why we call ftrace_arch_code_modify_prepare(). That should
> change all text to RW.

Thanks for the kind pointer.
We would add codes into your patch based on notifier and send patch to you by
private email.

Yanmin