Address in __mcount_loc may be invalid if somthing goes wrong.
On our arm64 platform, the bug in recordmcount make kernel
crash in ftrace_init().
https://lore.kernel.org/lkml/[email protected]/
Return -EFAULT if we are dealing with out-of-range condition
to prevent dereference the invalid address in ftrace_bug(),
then the kernel can disable ftrace safely for problematic
__mcount_loc.
Signed-off-by: Mark-PK Tsai <[email protected]>
---
arch/arm64/kernel/ftrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index b5d3ddaf69d9..98bec8445a58 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -201,7 +201,7 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
preempt_enable();
if (WARN_ON(!mod))
- return -EINVAL;
+ return -EFAULT;
}
/*
--
2.18.0
On Mon, 7 Jun 2021 11:23:30 +0800
Mark-PK Tsai <[email protected]> wrote:
> Address in __mcount_loc may be invalid if somthing goes wrong.
> On our arm64 platform, the bug in recordmcount make kernel
> crash in ftrace_init().
How did it crash? The link below doesn't show any crash.
>
> https://lore.kernel.org/lkml/[email protected]/
>
> Return -EFAULT if we are dealing with out-of-range condition
> to prevent dereference the invalid address in ftrace_bug(),
> then the kernel can disable ftrace safely for problematic
> __mcount_loc.
!mod is not an out-of-range condition. It just happened that the other
bug caused this strange side-effect. A !mod does not mean a fault
happened. Just because it may have been caused by a fault in your use
case does not mean that it's a fault in all use cases. That's like
saying that your dog is a poodle, so all dogs are poodles.
A return of -EINVAL should not cause a crash. If it does, then that
needs to be fixed.
-- Steve
>
> Signed-off-by: Mark-PK Tsai <[email protected]>
> ---
> arch/arm64/kernel/ftrace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> index b5d3ddaf69d9..98bec8445a58 100644
> --- a/arch/arm64/kernel/ftrace.c
> +++ b/arch/arm64/kernel/ftrace.c
> @@ -201,7 +201,7 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
> preempt_enable();
>
> if (WARN_ON(!mod))
> - return -EINVAL;
> + return -EFAULT;
> }
>
> /*
> On Mon, 7 Jun 2021 11:23:30 +0800
> Mark-PK Tsai <[email protected]> wrote:
>
> > Address in __mcount_loc may be invalid if somthing goes wrong.
> > On our arm64 platform, the bug in recordmcount make kernel
> > crash in ftrace_init().
>
> How did it crash? The link below doesn't show any crash.
Below is the crash log:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2008 ftrace_bug+0x9c/0x38c
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Tainted: G W 5.4.61-350609-gf78fedda5a5e #1
Hardware name: MediaTek MT5896 (DT)
pstate: 60400089 (nZCv daIf +PAN -UAO)
pc : ftrace_bug+0x9c/0x38c
lr : ftrace_process_locs+0x314/0x3b8
sp : ffffffc011743ef0
x29: ffffffc011743f00 x28: 0000000000000001
x27: ffffff818e401b80 x26: 0000000000000000
x25: ffffff818e480008 x24: ffffffc011749000
x23: 0000000000000008 x22: 0000000000000000
x21: ffffffc010084ac0 x20: 0000000000000024
x19: ffffff818e480000 x18: ffffffc011759c20
x17: ffffffc01133dcf8 x16: 0000000000000068
x15: ffffffc01133dcf8 x14: 0000000000000000
x13: 0000000000000000 x12: ffffffc010084ae4
x11: ffffffc011749000 x10: ffffffc011749000
x9 : 0000000000000001 x8 : ffffffc011749000
x7 : 0000000000000000 x6 : 000000000000003f
x5 : 000000000008e93d x4 : 0000000000000000
x3 : 0000000000000101 x2 : ffffffc010084ac0
x1 : ffffff818e480000 x0 : ffffffc01127621c
Call trace:
ftrace_bug+0x9c/0x38c
ftrace_process_locs+0x314/0x3b8
ftrace_init+0x8c/0xbc
start_kernel+0x180/0x40c
---[ end trace 59db467eb74a6604 ]---
ftrace failed to modify
[<0000000000000024>] 0x24
actual:
"Unable to handle kernel read from unreadable memory at virtual address 0000000000000024
"Mem abort info:
And the crash is becuase kernel trying to read *rec->ip in print_ip_ins() if
ftrace_bug() get error code -EINVAL.
>
> >
> > https://lore.kernel.org/lkml/[email protected]/
> >
> > Return -EFAULT if we are dealing with out-of-range condition
> > to prevent dereference the invalid address in ftrace_bug(),
> > then the kernel can disable ftrace safely for problematic
> > __mcount_loc.
>
> !mod is not an out-of-range condition. It just happened that the other
> bug caused this strange side-effect. A !mod does not mean a fault
> happened. Just because it may have been caused by a fault in your use
> case does not mean that it's a fault in all use cases. That's like
> saying that your dog is a poodle, so all dogs are poodles.
>
> A return of -EINVAL should not cause a crash. If it does, then that
> needs to be fixed.
I understand.
Keep -EINVAL here make more sense.
So maybe we should handle this case in ftrace_bug() by checking the rec->ip?
>
> -- Steve
>
>
> >
> > Signed-off-by: Mark-PK Tsai <[email protected]>
> > ---
> > arch/arm64/kernel/ftrace.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> > index b5d3ddaf69d9..98bec8445a58 100644
> > --- a/arch/arm64/kernel/ftrace.c
> > +++ b/arch/arm64/kernel/ftrace.c
> > @@ -201,7 +201,7 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
> > preempt_enable();
> >
> > if (WARN_ON(!mod))
> > - return -EINVAL;
> > + return -EFAULT;
> > }
> >
> > /*
> > On Mon, 7 Jun 2021 11:23:30 +0800
> > Mark-PK Tsai <[email protected]> wrote:
> >
> > > Address in __mcount_loc may be invalid if somthing goes wrong.
> > > On our arm64 platform, the bug in recordmcount make kernel
> > > crash in ftrace_init().
> >
> > How did it crash? The link below doesn't show any crash.
>
> Below is the crash log:
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2008 ftrace_bug+0x9c/0x38c
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper Tainted: G W 5.4.61-350609-gf78fedda5a5e #1
> Hardware name: MediaTek MT5896 (DT)
> pstate: 60400089 (nZCv daIf +PAN -UAO)
> pc : ftrace_bug+0x9c/0x38c
> lr : ftrace_process_locs+0x314/0x3b8
> sp : ffffffc011743ef0
> x29: ffffffc011743f00 x28: 0000000000000001
> x27: ffffff818e401b80 x26: 0000000000000000
> x25: ffffff818e480008 x24: ffffffc011749000
> x23: 0000000000000008 x22: 0000000000000000
> x21: ffffffc010084ac0 x20: 0000000000000024
> x19: ffffff818e480000 x18: ffffffc011759c20
> x17: ffffffc01133dcf8 x16: 0000000000000068
> x15: ffffffc01133dcf8 x14: 0000000000000000
> x13: 0000000000000000 x12: ffffffc010084ae4
> x11: ffffffc011749000 x10: ffffffc011749000
> x9 : 0000000000000001 x8 : ffffffc011749000
> x7 : 0000000000000000 x6 : 000000000000003f
> x5 : 000000000008e93d x4 : 0000000000000000
> x3 : 0000000000000101 x2 : ffffffc010084ac0
> x1 : ffffff818e480000 x0 : ffffffc01127621c
> Call trace:
> ftrace_bug+0x9c/0x38c
> ftrace_process_locs+0x314/0x3b8
> ftrace_init+0x8c/0xbc
> start_kernel+0x180/0x40c
> ---[ end trace 59db467eb74a6604 ]---
> ftrace failed to modify
> [<0000000000000024>] 0x24
> actual:
> "Unable to handle kernel read from unreadable memory at virtual address 0000000000000024
> "Mem abort info:
I'm sorry that the last reply I only post the warning log before crash.
Below is the panic log right after this warning.
"Unable to handle kernel read from unreadable memory at virtual address 0000000000000024
"Mem abort info:
" ESR = 0x96000005
" EC = 0x25: DABT (current EL), IL = 32 bits
" SET = 0, FnV = 0
" EA = 0, S1PTW = 0
"Data abort info:
" ISV = 0, ISS = 0x00000005
" CM = 0, WnR = 0
"[0000000000000024] user address but active_mm is swapper
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Tainted: G W 5.4.61-350609-gf78fedda5a5e #1
pstate: 60400089 (nZCv daIf +PAN -UAO)
pc : ftrace_bug+0xd8/0x38c
lr : ftrace_bug+0xd8/0x38c
sp : ffffffc011743ef0
x29: ffffffc011743f00 x28: 0000000000000001
x27: ffffff818e401b80 x26: 0000000000000000
x25: ffffff818e480008 x24: ffffffc011749000
x23: 0000000000000008 x22: 0000000000000000
x21: ffffffc010084ac0 x20: 0000000000000024
x19: ffffff818e480000 x18: ffffffc011759c20
x17: 0000000000000031 x16: ffffffc010d40c38
x15: ffffffc0114ad9ef x14: 000000000000004e
x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 00000000ffffffff
x9 : 0000000000000000 x8 : 0000000000000000
x7 : 000000000000000b x6 : ffffffc01188720b
x5 : 000000000000000b x4 : ffffffc0118848ce
x3 : 0000000000000020 x2 : 000000000000000b
x1 : ffffffc0118848d9 x0 : 000000000000000b
Call trace:
ftrace_bug+0xd8/0x38c
ftrace_process_locs+0x314/0x3b8
ftrace_init+0x8c/0xbc
start_kernel+0x180/0x40c
Code: f00085a1 9120d000 913e4421 97fdef85 (39400282)
---[ end trace 59db467eb74a6605 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
>
> And the crash is becuase kernel trying to read *rec->ip in print_ip_ins() if
> ftrace_bug() get error code -EINVAL.
>
> >
> > >
> > > https://lore.kernel.org/lkml/[email protected]/
> > >
> > > Return -EFAULT if we are dealing with out-of-range condition
> > > to prevent dereference the invalid address in ftrace_bug(),
> > > then the kernel can disable ftrace safely for problematic
> > > __mcount_loc.
> >
> > !mod is not an out-of-range condition. It just happened that the other
> > bug caused this strange side-effect. A !mod does not mean a fault
> > happened. Just because it may have been caused by a fault in your use
> > case does not mean that it's a fault in all use cases. That's like
> > saying that your dog is a poodle, so all dogs are poodles.
> >
> > A return of -EINVAL should not cause a crash. If it does, then that
> > needs to be fixed.
>
> I understand.
> Keep -EINVAL here make more sense.
> So maybe we should handle this case in ftrace_bug() by checking the rec->ip?
>
> >
> > -- Steve
> >
> >
> > >
> > > Signed-off-by: Mark-PK Tsai <[email protected]>
> > > ---
> > > arch/arm64/kernel/ftrace.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
> > > index b5d3ddaf69d9..98bec8445a58 100644
> > > --- a/arch/arm64/kernel/ftrace.c
> > > +++ b/arch/arm64/kernel/ftrace.c
> > > @@ -201,7 +201,7 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec,
> > > preempt_enable();
> > >
> > > if (WARN_ON(!mod))
> > > - return -EINVAL;
> > > + return -EFAULT;
> > > }
> > >
> > > /*
>
On Mon, 7 Jun 2021 22:15:22 +0800
Mark-PK Tsai <[email protected]> wrote:
> And the crash is becuase kernel trying to read *rec->ip in print_ip_ins() if
> ftrace_bug() get error code -EINVAL.
Right, so the actual fix should be something like this:
[ not tested, nor even compiled ]
-- Steve
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 2e8a3fde7104..72ef4dccbcc4 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1967,12 +1967,18 @@ static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops,
static void print_ip_ins(const char *fmt, const unsigned char *p)
{
+ char ins[MCOUNT_INSN_SIZE];
int i;
+ if (copy_from_kernel_nofault(ins, p, MCOUNT_INSN_SIZE)) {
+ printk(KERN_CONT "%s[FAULT](%px)", fmt, p);
+ return;
+ }
+
printk(KERN_CONT "%s", fmt);
for (i = 0; i < MCOUNT_INSN_SIZE; i++)
- printk(KERN_CONT "%s%02x", i ? ":" : "", p[i]);
+ printk(KERN_CONT "%s%02x", i ? ":" : "", ins[i]);
}
enum ftrace_bug_type ftrace_bug_type;
> On Mon, 7 Jun 2021 22:15:22 +0800
> Mark-PK Tsai <[email protected]> wrote:
>
> > And the crash is becuase kernel trying to read *rec->ip in print_ip_ins() if
> > ftrace_bug() get error code -EINVAL.
>
> Right, so the actual fix should be something like this:
>
> [ not tested, nor even compiled ]
>
> -- Steve
>
I just test this patch, and turns out it work well.
All my modification is only rename copy_from_kernel_nofault
into probe_kernel_read because the older kernel version
I use.
Then the kernel will show the error mesage instead of panic as following.
[ 0.000000] ftrace failed to modify
[ 0.000000] [<0000000000000020>] 0x20
[ 0.000000] actual: [FAULT](0000000000000020)
Should I resend this patch as v2?
Or you will upstream this fix?
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 2e8a3fde7104..72ef4dccbcc4 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -1967,12 +1967,18 @@ static int ftrace_hash_ipmodify_update(struct ftrace_ops *ops,
>
> static void print_ip_ins(const char *fmt, const unsigned char *p)
> {
> + char ins[MCOUNT_INSN_SIZE];
> int i;
>
> + if (copy_from_kernel_nofault(ins, p, MCOUNT_INSN_SIZE)) {
> + printk(KERN_CONT "%s[FAULT](%px)", fmt, p);
> + return;
> + }
> +
> printk(KERN_CONT "%s", fmt);
>
> for (i = 0; i < MCOUNT_INSN_SIZE; i++)
> - printk(KERN_CONT "%s%02x", i ? ":" : "", p[i]);
> + printk(KERN_CONT "%s%02x", i ? ":" : "", ins[i]);
> }
>
> enum ftrace_bug_type ftrace_bug_type;
On Tue, 8 Jun 2021 02:14:03 +0800
Mark-PK Tsai <[email protected]> wrote:
> Should I resend this patch as v2?
> Or you will upstream this fix?
I'll push it. I have some other fixes to add as well.
Thanks!
-- Steve
> On Tue, 8 Jun 2021 02:14:03 +0800
> Mark-PK Tsai <[email protected]> wrote:
>
> > Should I resend this patch as v2?
> > Or you will upstream this fix?
>
> I'll push it. I have some other fixes to add as well.
>
> Thanks!
>
> -- Steve
Thanks!
Here is my reported and tested by.
Reported-and-tested-by: Mark-PK Tsai <[email protected]>