2014-02-17 15:23:04

by Petr Mladek

[permalink] [raw]
Subject: [PATCH 0/4] x86: Fix ftrace recovery when code modification failed

Ftrace modifies function calls using Int3 breakpoints on x86. It patches
all functions in parallel to reduce the number of sync() calls. There is
a code that removes pending Int3 breakpoints when something goes wrong.

The recovery does not work on x86_64. I simulated an error in
ftrace_replace_code() and the system got rebooted.

This patch set fixes the recovery code, improves debugging of this
type of problem, and does some clean up.

BTW: It is an echo from the patch set that tried to use text_poke_bp()
instead of the ftrace-specific Int3 based patching code. Ftrace has
some specific restrictions. I did not find a way how to make the
universal text_poke_bp effective, safe, and clean to be usable
there.

The patch set is against linux/tip. Last commit is a5b3cca53c43c3ba7
(Merge tag 'v3.14-rc3')

Petr Mladek (4):
x86: Clean up remove_breakpoint() in ftrace code
x86: Fix ftrace patching recovery code to work on x86_64
x86: BUG when ftrace patching recovery fails
x86: Fix order of warning messages when ftrace modifies code

arch/x86/kernel/ftrace.c | 67 ++++++++++++++++++++++++------------------------
1 file changed, 34 insertions(+), 33 deletions(-)

--
1.8.4


2014-02-17 15:23:23

by Petr Mladek

[permalink] [raw]
Subject: [PATCH 1/4] x86: Clean up remove_breakpoint() in ftrace code

The ftrace recovery code does not work on x86_64. I tried to
debug it and got a bit confused by remove_breakpoint(). One problem
was reusing the "nop" pointer for different types of code. Also
the combination of two level if-condition and goto was a bit
tricky.

Signed-off-by: Petr Mladek <[email protected]>
---
arch/x86/kernel/ftrace.c | 58 ++++++++++++++++++++++--------------------------
1 file changed, 26 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index e6253195a301..e7f3f3f565de 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -405,17 +405,16 @@ static int add_breakpoints(struct dyn_ftrace *rec, int enable)

/*
* On error, we need to remove breakpoints. This needs to
- * be done caefully. If the address does not currently have a
+ * be done carefully. If the address does not currently have a
* breakpoint, we know we are done. Otherwise, we look at the
- * remaining 4 bytes of the instruction. If it matches a nop
- * we replace the breakpoint with the nop. Otherwise we replace
- * it with the call instruction.
+ * remaining 4 bytes of the instruction. It has to be a valid
+ * code. If not, don't touch the breakpoint, we would just
+ * create a disaster.
*/
static int remove_breakpoint(struct dyn_ftrace *rec)
{
unsigned char ins[MCOUNT_INSN_SIZE];
- unsigned char brk = BREAKPOINT_INSTRUCTION;
- const unsigned char *nop;
+ const unsigned char *valid_ins;
unsigned long ftrace_addr;
unsigned long ip = rec->ip;

@@ -424,38 +423,33 @@ static int remove_breakpoint(struct dyn_ftrace *rec)
return -EFAULT;

/* If this does not have a breakpoint, we are done */
- if (ins[0] != brk)
+ if (ins[0] != BREAKPOINT_INSTRUCTION)
return -1;

- nop = ftrace_nop_replace();
+ /* Check if it is nop instruction */
+ valid_ins = ftrace_nop_replace();
+ if (memcmp(&ins[1], &valid_ins[1], MCOUNT_INSN_SIZE - 1) == 0)
+ goto update;

- /*
- * If the last 4 bytes of the instruction do not match
- * a nop, then we assume that this is a call to ftrace_addr.
- */
- if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) != 0) {
- /*
- * For extra paranoidism, we check if the breakpoint is on
- * a call that would actually jump to the ftrace_addr.
- * If not, don't touch the breakpoint, we make just create
- * a disaster.
- */
- ftrace_addr = get_ftrace_addr(rec);
- nop = ftrace_call_replace(ip, ftrace_addr);
-
- if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) == 0)
- goto update;
-
- /* Check both ftrace_addr and ftrace_old_addr */
- ftrace_addr = get_ftrace_old_addr(rec);
- nop = ftrace_call_replace(ip, ftrace_addr);

- if (memcmp(&ins[1], &nop[1], MCOUNT_INSN_SIZE - 1) != 0)
- return -EINVAL;
- }
+ /* Or it might be ftrace call instruction */
+ ftrace_addr = get_ftrace_addr(rec);
+ valid_ins = ftrace_call_replace(ip, ftrace_addr);
+ if (memcmp(&ins[1], &valid_ins[1], MCOUNT_INSN_SIZE - 1) == 0)
+ goto update;
+
+ /* Last chance, it might be the old ftrace call instruction */
+ ftrace_addr = get_ftrace_old_addr(rec);
+ valid_ins = ftrace_call_replace(ip, ftrace_addr);
+ if (memcmp(&ins[1], &valid_ins[1], MCOUNT_INSN_SIZE - 1) == 0)
+ goto update;
+
+ /* Hmm, it is an unknown code. Rather bail out. */
+ return -EINVAL;

update:
- return probe_kernel_write((void *)ip, &nop[0], 1);
+ /* Put back the first byte */
+ return probe_kernel_write((void *)ip, valid_ins, 1);
}

static int add_update_code(unsigned long ip, unsigned const char *new)
--
1.8.4

2014-02-17 15:23:43

by Petr Mladek

[permalink] [raw]
Subject: [PATCH 2/4] x86: Fix ftrace patching recovery code to work on x86_64

Ftrace modifies function calls using Int3 breakpoints on x86. It patches
all functions in parallel to reduce the number of sync() calls. If some
function cannot be modified, it tries to recover and remove the other
pending breakpoints.

The recovery currently does not work on x86_64 because the address
is read-only. We need to use ftrace_write() that gets write access
via the kernel identity mapping. It is used everywhere else in the
code.

Signed-off-by: Petr Mladek <[email protected]>
---
arch/x86/kernel/ftrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index e7f3f3f565de..30d63c4a4195 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -449,7 +449,7 @@ static int remove_breakpoint(struct dyn_ftrace *rec)

update:
/* Put back the first byte */
- return probe_kernel_write((void *)ip, valid_ins, 1);
+ return ftrace_write(ip, valid_ins, 1);
}

static int add_update_code(unsigned long ip, unsigned const char *new)
--
1.8.4

2014-02-17 15:24:04

by Petr Mladek

[permalink] [raw]
Subject: [PATCH 3/4] x86: BUG when ftrace patching recovery fails

Ftrace modifies function calls using Int3 breakpoints on x86.
The breakpoints are handled only when the patching is in progress.
If something goes wrong, there is a recovery code that removes
the breakpoints. If this fails, the system might get silently
rebooted when a remaining break is not handled or an invalid
instruction is proceed.

A better solution is to BUG() when the recovery fails. It helps
to point to the sinner responsible for the reboot.

Signed-off-by: Petr Mladek <[email protected]>
---
arch/x86/kernel/ftrace.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 30d63c4a4195..525a9f954c8b 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -424,7 +424,7 @@ static int remove_breakpoint(struct dyn_ftrace *rec)

/* If this does not have a breakpoint, we are done */
if (ins[0] != BREAKPOINT_INSTRUCTION)
- return -1;
+ return 0;

/* Check if it is nop instruction */
valid_ins = ftrace_nop_replace();
@@ -625,8 +625,15 @@ void ftrace_replace_code(int enable)
ftrace_bug(ret, rec ? rec->ip : 0);
printk(KERN_WARNING "Failed on %s (%d):\n", report, count);
for_ftrace_rec_iter(iter) {
+ int err;
+
rec = ftrace_rec_iter_record(iter);
- remove_breakpoint(rec);
+ err = remove_breakpoint(rec);
+ /*
+ * The breakpoints will not be handled after this function
+ * finishes. Let's stop on a well defined point.
+ */
+ BUG_ON(err);
}
}

--
1.8.4

2014-02-17 15:24:24

by Petr Mladek

[permalink] [raw]
Subject: [PATCH 4/4] x86: Fix order of warning messages when ftrace modifies code

The colon at the end of the printk message suggests that it should get printed
before the details printed by ftrace_bug().

When touching the line, let's use the preferred pr_warn() macro as suggested
by checkpatch.pl.

Signed-off-by: Petr Mladek <[email protected]>
---
arch/x86/kernel/ftrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 525a9f954c8b..ad7c38f5206b 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -622,8 +622,8 @@ void ftrace_replace_code(int enable)
return;

remove_breakpoints:
+ pr_warn("Failed on %s (%d):\n", report, count);
ftrace_bug(ret, rec ? rec->ip : 0);
- printk(KERN_WARNING "Failed on %s (%d):\n", report, count);
for_ftrace_rec_iter(iter) {
int err;

--
1.8.4

2014-02-21 04:23:12

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/4] x86: Fix ftrace recovery when code modification failed

On Mon, 17 Feb 2014 16:22:49 +0100
Petr Mladek <[email protected]> wrote:

> Ftrace modifies function calls using Int3 breakpoints on x86. It patches
> all functions in parallel to reduce the number of sync() calls. There is
> a code that removes pending Int3 breakpoints when something goes wrong.
>
> The recovery does not work on x86_64. I simulated an error in
> ftrace_replace_code() and the system got rebooted.

Thanks for the report, I just did the same and it caused a reboot too.

>
> This patch set fixes the recovery code, improves debugging of this
> type of problem, and does some clean up.
>
> BTW: It is an echo from the patch set that tried to use text_poke_bp()
> instead of the ftrace-specific Int3 based patching code. Ftrace has
> some specific restrictions. I did not find a way how to make the
> universal text_poke_bp effective, safe, and clean to be usable
> there.
>
> The patch set is against linux/tip. Last commit is a5b3cca53c43c3ba7
> (Merge tag 'v3.14-rc3')
>
> Petr Mladek (4):
> x86: Clean up remove_breakpoint() in ftrace code
> x86: Fix ftrace patching recovery code to work on x86_64
> x86: BUG when ftrace patching recovery fails
> x86: Fix order of warning messages when ftrace modifies code
>
> arch/x86/kernel/ftrace.c | 67 ++++++++++++++++++++++++------------------------
> 1 file changed, 34 insertions(+), 33 deletions(-)
>

That's quite a bit of changes. I did a quick debug analysis of it, and
found two bugs.

One, I shouldn't be using direct access to the ip with
probe_kernel_write(), I need to do the within() magic (also have a
ftrace_write() wrapper now).

The second bug is that I need to do another run_sync() after the fix
up. Can you see if this patch fixes the issue?

-- Steve

ftrace.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index e625319..6b566c8 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -455,7 +455,7 @@ static int remove_breakpoint(struct dyn_ftrace *rec)
}

update:
- return probe_kernel_write((void *)ip, &nop[0], 1);
+ return ftrace_write(ip, nop, 1);
}

static int add_update_code(unsigned long ip, unsigned const char *new)
@@ -634,6 +634,7 @@ void ftrace_replace_code(int enable)
rec = ftrace_rec_iter_record(iter);
remove_breakpoint(rec);
}
+ run_sync();
}

static int
@@ -664,7 +665,7 @@ ftrace_modify_code(unsigned long ip, unsigned const
char *old_code, return ret;

fail_update:
- probe_kernel_write((void *)ip, &old_code[0], 1);
+ ftrace_write(ip, old_code, 1);
goto out;
}

2014-02-21 16:33:55

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH 0/4] x86: Fix ftrace recovery when code modification failed

On Thu 20-02-14 23:23:08, Steven Rostedt wrote:
> On Mon, 17 Feb 2014 16:22:49 +0100
> Petr Mladek <[email protected]> wrote:
>
> > Ftrace modifies function calls using Int3 breakpoints on x86. It patches
> > all functions in parallel to reduce the number of sync() calls. There is
> > a code that removes pending Int3 breakpoints when something goes wrong.
> >
> > The recovery does not work on x86_64. I simulated an error in
> > ftrace_replace_code() and the system got rebooted.
>
> Thanks for the report, I just did the same and it caused a reboot too.
>
> > The patch set is against linux/tip. Last commit is a5b3cca53c43c3ba7
> > (Merge tag 'v3.14-rc3')
> >
> > Petr Mladek (4):
> > x86: Clean up remove_breakpoint() in ftrace code
> > x86: Fix ftrace patching recovery code to work on x86_64
> > x86: BUG when ftrace patching recovery fails
> > x86: Fix order of warning messages when ftrace modifies code
> >
> > arch/x86/kernel/ftrace.c | 67 ++++++++++++++++++++++++------------------------
> > 1 file changed, 34 insertions(+), 33 deletions(-)
> >
>
> That's quite a bit of changes. I did a quick debug analysis of it, and
> found two bugs.

The first commit did refactoring to make the code better readable.
The real fix was in the second patch.

> One, I shouldn't be using direct access to the ip with
> probe_kernel_write(), I need to do the within() magic (also have a
> ftrace_write() wrapper now).

Yup, this was the main reason why it failed. Huh, I missed that the
same problem was also in ftrace_modify_code().


> The second bug is that I need to do another run_sync() after the fix
> up.

Great catch! I have missed the sync.

> Can you see if this patch fixes the issue?

Yes, it works but I would still do few more changes:

+ run_sync() also in ftrace_modify_code() in the recovery
part; I would solve this by moving the "out" label.

+ print some warning in update_ftrace_func() when
ftrace_modify_code() fails; otherwise, the error
is silently ignored

+ BUG when the breakpoint cannot be removed; the system
silently crash anyway because the breakpoint will not
be handled

I wonder if we should call FTRACE_WARN_ON_ONCE that calls
"ftrace_kill" when update_ftrace_func() fails.

Anyway, here is a proposed patch that merges your changes and my
improvements. Let me know if I should improve, rework or split it.

--------

Subject: [PATCH] x86: Fix ftrace recovery code

Ftrace modifies function calls using Int3 breakpoints on x86. If some
function cannot be modified, it tries to recover and remove the pending
breakpoints.

The recovery currently does not work on x86_64 because the address
is read-only. We need to use ftrace_write() that gets write access
via the kernel identity mapping. It is used everywhere else in the
code.

Note that we need to modify remove_breakpoint() to return non-zero
value only when there is an error. The return value was ignored before,
so it does not cause any troubles.

In addition, we should print some warning when update_ftrace_func()
fails. Otherwise, ftrace does not work as expected but there is
no explanation.

Finally, we should BUG() when the breakpoint could not be removed.
Otherwise, the system silently crashes when the function finishes
the Int3 handler is disabled.

Signed-off-by: Petr Mladek <[email protected]>
---
arch/x86/kernel/ftrace.c | 29 ++++++++++++++++++++---------
1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index e6253195a301..dc3900d84b09 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -239,6 +239,7 @@ static int update_ftrace_func(unsigned long ip, void *new)
atomic_inc(&modifying_ftrace_code);

ret = ftrace_modify_code(ip, old, new);
+ WARN_ONCE(ret, "Failed to update ftrace function: %d\n", ret);

atomic_dec(&modifying_ftrace_code);

@@ -425,7 +426,7 @@ static int remove_breakpoint(struct dyn_ftrace *rec)

/* If this does not have a breakpoint, we are done */
if (ins[0] != brk)
- return -1;
+ return 0;

nop = ftrace_nop_replace();

@@ -455,7 +456,7 @@ static int remove_breakpoint(struct dyn_ftrace *rec)
}

update:
- return probe_kernel_write((void *)ip, &nop[0], 1);
+ return ftrace_write(ip, nop, 1);
}

static int add_update_code(unsigned long ip, unsigned const char *new)
@@ -632,8 +633,14 @@ void ftrace_replace_code(int enable)
printk(KERN_WARNING "Failed on %s (%d):\n", report, count);
for_ftrace_rec_iter(iter) {
rec = ftrace_rec_iter_record(iter);
- remove_breakpoint(rec);
+ /*
+ * Breakpoints are handled only when this function is in
+ * progress. The system could not work with them.
+ */
+ if (remove_breakpoint(rec))
+ BUG();
}
+ run_sync();
}

static int
@@ -655,16 +662,20 @@ ftrace_modify_code(unsigned long ip, unsigned const char *old_code,
run_sync();

ret = ftrace_write(ip, new_code, 1);
- if (ret) {
- ret = -EPERM;
- goto out;
- }
- run_sync();
+ /*
+ * The breakpoint is handled only when this function is in progress.
+ * The system could not work if we could not remove it.
+ */
+ BUG_ON(ret);
out:
+ run_sync();
+
return ret;

fail_update:
- probe_kernel_write((void *)ip, &old_code[0], 1);
+ /* Also here the system could not work with the breakpoint */
+ if (ftrace_write(ip, old_code, 1))
+ BUG();
goto out;
}

--
1.8.4

2014-02-21 18:01:13

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/4] x86: Fix ftrace recovery when code modification failed

On Fri, 21 Feb 2014 17:33:52 +0100
Petr Mládek <[email protected]> wrote:

> Yes, it works but I would still do few more changes:
>
> + run_sync() also in ftrace_modify_code() in the recovery
> part; I would solve this by moving the "out" label.

Sounds good.

>
> + print some warning in update_ftrace_func() when
> ftrace_modify_code() fails; otherwise, the error
> is silently ignored

The error is passed down to the callers. Any warning should happen
there. We can move that up to the generic code. Otherwise, this is a
x86 only fix, and we leave all the other archs broken.

ftrace_modify_all_code() looks to be where the warnings should be. And
yes, any warning should also call ftrace_bug(). That can be done via a
FTRACE_WARN_ON().

>
> + BUG when the breakpoint cannot be removed; the system
> silently crash anyway because the breakpoint will not
> be handled
>
> I wonder if we should call FTRACE_WARN_ON_ONCE that calls
> "ftrace_kill" when update_ftrace_func() fails.

Don't need the _ONCE() version. Something like this (I'll let you make
the patch ;-)

int ret = 0;

if (update)
ret = ftrace_update_ftrace_func(ftrace_ops_list_func);
if (ret)
goto out;

if (command & FTRACE_UPDATE_CALLS)
ret = ftrace_replace_code(1);
else if (command & FTRACE_DISABLE_CALLS)
ret = ftrace_replace_code(0);

if (ret)
goto out;

if (update && ftrace_trace_function != ftrace_ops_list_func) {
function_trace_op = set_function_trace_op;
smp_wmb();
/* If irqs are disabled, we are in stop machine */
if (!irqs_disabled())
smp_call_function(ftrace_sync_ipi, NULL, 1);
ret = ftrace_update_ftrace_func(ftrace_trace_function);
}

if (ret)
goto out;

if (command & FTRACE_START_FUNC_RET)
ret = ftrace_enable_ftrace_graph_caller();
else if (command & FTRACE_STOP_FUNC_RET)
ret = ftrace_disable_ftrace_graph_caller();

out:
FTRACE_WARN_ON(ret);

Or if you want to keep where it failed, just replace the if (ret) goto
out, with:

if (FTRACE_WARN_ON(ret))
return;

That may actually be better.

I've already added my patch. I'll push it up to my ftrace/urgent branch
at

git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git

And you can write a patch based on that.

Thanks!

-- Steve