2024-05-01 16:31:25

by Stephen Brennan

[permalink] [raw]
Subject: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

If an error happens in ftrace, ftrace_kill() will prevent disarming
kprobes. Eventually, the ftrace_ops associated with the kprobes will be
freed, yet the kprobes will still be active, and when triggered, they
will use the freed memory, likely resulting in a page fault and panic.

This behavior can be reproduced quite easily, by creating a kprobe and
then triggering a ftrace_kill(). For simplicity, we can simulate an
ftrace error with a kernel module like [1]:

[1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer

sudo perf probe --add commit_creds
sudo perf trace -e probe:commit_creds
# In another terminal
make
sudo insmod ftrace_killer.ko # calls ftrace_kill(), simulating bug
# Back to perf terminal
# ctrl-c
sudo perf probe --del commit_creds

After a short period, a page fault and panic would occur as the kprobe
continues to execute and uses the freed ftrace_ops. While ftrace_kill()
is supposed to be used only in extreme circumstances, it is invoked in
FTRACE_WARN_ON() and so there are many places where an unexpected bug
could be triggered, yet the system may continue operating, possibly
without the administrator noticing. If ftrace_kill() does not panic the
system, then we should do everything we can to continue operating,
rather than leave a ticking time bomb.

Signed-off-by: Stephen Brennan <[email protected]>
---
Changes in v3:
Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
variable and check it directly in the kprobe handlers.
Link to v1/v2 discussion:
https://lore.kernel.org/all/[email protected]/

arch/csky/kernel/probes/ftrace.c | 3 +++
arch/loongarch/kernel/ftrace_dyn.c | 3 +++
arch/parisc/kernel/ftrace.c | 3 +++
arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
arch/riscv/kernel/probes/ftrace.c | 3 +++
arch/s390/kernel/ftrace.c | 3 +++
arch/x86/kernel/kprobes/ftrace.c | 3 +++
include/linux/kprobes.h | 7 +++++++
kernel/kprobes.c | 6 ++++++
kernel/trace/ftrace.c | 1 +
10 files changed, 35 insertions(+)

diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index 834cffcfbce3..7ba4b98076de 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
struct kprobe_ctlblk *kcb;
struct pt_regs *regs;

+ if (unlikely(kprobe_ftrace_disabled))
+ return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
index 73858c9029cc..bff058317062 100644
--- a/arch/loongarch/kernel/ftrace_dyn.c
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
struct kprobe *p;
struct kprobe_ctlblk *kcb;

+ if (unlikely(kprobe_ftrace_disabled))
+ return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 621a4b386ae4..c91f9c2e61ed 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
struct kprobe *p;
int bit;

+ if (unlikely(kprobe_ftrace_disabled))
+ return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 072ebe7f290b..f8208c027148 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
struct pt_regs *regs;
int bit;

+ if (unlikely(kprobe_ftrace_disabled))
+ return;
+
bit = ftrace_test_recursion_trylock(nip, parent_nip);
if (bit < 0)
return;
diff --git a/arch/riscv/kernel/probes/ftrace.c b/arch/riscv/kernel/probes/ftrace.c
index 7142ec42e889..a69dfa610aa8 100644
--- a/arch/riscv/kernel/probes/ftrace.c
+++ b/arch/riscv/kernel/probes/ftrace.c
@@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
struct kprobe_ctlblk *kcb;
int bit;

+ if (unlikely(kprobe_ftrace_disabled))
+ return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index c46381ea04ec..7f6f8c438c26 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
struct kprobe *p;
int bit;

+ if (unlikely(kprobe_ftrace_disabled))
+ return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index dd2ec14adb77..15af7e98e161 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
struct kprobe_ctlblk *kcb;
int bit;

+ if (unlikely(kprobe_ftrace_disabled))
+ return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 0ff44d6633e3..5fcbc254d186 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -378,11 +378,15 @@ static inline void wait_for_kprobe_optimizer(void) { }
extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *ops, struct ftrace_regs *fregs);
extern int arch_prepare_kprobe_ftrace(struct kprobe *p);
+/* Set when ftrace has been killed: kprobes on ftrace must be disabled for safety */
+extern bool kprobe_ftrace_disabled __read_mostly;
+extern void kprobe_ftrace_kill(void);
#else
static inline int arch_prepare_kprobe_ftrace(struct kprobe *p)
{
return -EINVAL;
}
+static inline void kprobe_ftrace_kill(void) {}
#endif /* CONFIG_KPROBES_ON_FTRACE */

/* Get the kprobe at this addr (if any) - called with preemption disabled */
@@ -495,6 +499,9 @@ static inline void kprobe_flush_task(struct task_struct *tk)
static inline void kprobe_free_init_mem(void)
{
}
+static inline void kprobe_ftrace_kill(void)
+{
+}
static inline int disable_kprobe(struct kprobe *kp)
{
return -EOPNOTSUPP;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 65adc815fc6e..166ebf81dc45 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1068,6 +1068,7 @@ static struct ftrace_ops kprobe_ipmodify_ops __read_mostly = {

static int kprobe_ipmodify_enabled;
static int kprobe_ftrace_enabled;
+bool kprobe_ftrace_disabled;

static int __arm_kprobe_ftrace(struct kprobe *p, struct ftrace_ops *ops,
int *cnt)
@@ -1136,6 +1137,11 @@ static int disarm_kprobe_ftrace(struct kprobe *p)
ipmodify ? &kprobe_ipmodify_ops : &kprobe_ftrace_ops,
ipmodify ? &kprobe_ipmodify_enabled : &kprobe_ftrace_enabled);
}
+
+void kprobe_ftrace_kill()
+{
+ kprobe_ftrace_disabled = true;
+}
#else /* !CONFIG_KPROBES_ON_FTRACE */
static inline int arm_kprobe_ftrace(struct kprobe *p)
{
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index da1710499698..96db99c347b3 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -7895,6 +7895,7 @@ void ftrace_kill(void)
ftrace_disabled = 1;
ftrace_enabled = 0;
ftrace_trace_function = ftrace_stub;
+ kprobe_ftrace_kill();
}

/**
--
2.39.3



2024-05-01 17:24:26

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

On Wed, 1 May 2024 09:29:56 -0700
Stephen Brennan <[email protected]> wrote:

> If an error happens in ftrace, ftrace_kill() will prevent disarming
> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
> freed, yet the kprobes will still be active, and when triggered, they
> will use the freed memory, likely resulting in a page fault and panic.
>
> This behavior can be reproduced quite easily, by creating a kprobe and
> then triggering a ftrace_kill(). For simplicity, we can simulate an
> ftrace error with a kernel module like [1]:
>
> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>
> sudo perf probe --add commit_creds
> sudo perf trace -e probe:commit_creds
> # In another terminal
> make
> sudo insmod ftrace_killer.ko # calls ftrace_kill(), simulating bug
> # Back to perf terminal
> # ctrl-c
> sudo perf probe --del commit_creds
>
> After a short period, a page fault and panic would occur as the kprobe
> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
> is supposed to be used only in extreme circumstances, it is invoked in
> FTRACE_WARN_ON() and so there are many places where an unexpected bug
> could be triggered, yet the system may continue operating, possibly
> without the administrator noticing. If ftrace_kill() does not panic the
> system, then we should do everything we can to continue operating,
> rather than leave a ticking time bomb.
>
> Signed-off-by: Stephen Brennan <[email protected]>
> ---
> Changes in v3:
> Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
> variable and check it directly in the kprobe handlers.

Reviewed-by: Steven Rostedt (Google) <[email protected]>

Thanks,

-- Steve

2024-05-01 17:35:46

by Guo Ren

[permalink] [raw]
Subject: Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

On Thu, May 2, 2024 at 12:30 AM Stephen Brennan
<[email protected]> wrote:
>
> If an error happens in ftrace, ftrace_kill() will prevent disarming
> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
> freed, yet the kprobes will still be active, and when triggered, they
> will use the freed memory, likely resulting in a page fault and panic.
>
> This behavior can be reproduced quite easily, by creating a kprobe and
> then triggering a ftrace_kill(). For simplicity, we can simulate an
> ftrace error with a kernel module like [1]:
>
> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>
> sudo perf probe --add commit_creds
> sudo perf trace -e probe:commit_creds
> # In another terminal
> make
> sudo insmod ftrace_killer.ko # calls ftrace_kill(), simulating bug
> # Back to perf terminal
> # ctrl-c
> sudo perf probe --del commit_creds
>
> After a short period, a page fault and panic would occur as the kprobe
> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
> is supposed to be used only in extreme circumstances, it is invoked in
> FTRACE_WARN_ON() and so there are many places where an unexpected bug
> could be triggered, yet the system may continue operating, possibly
> without the administrator noticing. If ftrace_kill() does not panic the
> system, then we should do everything we can to continue operating,
> rather than leave a ticking time bomb.
>
> Signed-off-by: Stephen Brennan <[email protected]>
> ---
> Changes in v3:
> Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
> variable and check it directly in the kprobe handlers.
> Link to v1/v2 discussion:
> https://lore.kernel.org/all/[email protected]/
>
> arch/csky/kernel/probes/ftrace.c | 3 +++
> arch/loongarch/kernel/ftrace_dyn.c | 3 +++
> arch/parisc/kernel/ftrace.c | 3 +++
> arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
> arch/riscv/kernel/probes/ftrace.c | 3 +++
> arch/s390/kernel/ftrace.c | 3 +++
> arch/x86/kernel/kprobes/ftrace.c | 3 +++
> include/linux/kprobes.h | 7 +++++++
> kernel/kprobes.c | 6 ++++++
> kernel/trace/ftrace.c | 1 +
> 10 files changed, 35 insertions(+)
>
> diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
> index 834cffcfbce3..7ba4b98076de 100644
> --- a/arch/csky/kernel/probes/ftrace.c
> +++ b/arch/csky/kernel/probes/ftrace.c
> @@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> struct kprobe_ctlblk *kcb;
> struct pt_regs *regs;
>
> + if (unlikely(kprobe_ftrace_disabled))
> + return;
> +
For csky part.
Acked-by: Guo Ren <[email protected]>

> bit = ftrace_test_recursion_trylock(ip, parent_ip);
> if (bit < 0)
> return;
> diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
> index 73858c9029cc..bff058317062 100644
> --- a/arch/loongarch/kernel/ftrace_dyn.c
> +++ b/arch/loongarch/kernel/ftrace_dyn.c
> @@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> struct kprobe *p;
> struct kprobe_ctlblk *kcb;
>
> + if (unlikely(kprobe_ftrace_disabled))
> + return;
> +
> bit = ftrace_test_recursion_trylock(ip, parent_ip);
> if (bit < 0)
> return;
> diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
> index 621a4b386ae4..c91f9c2e61ed 100644
> --- a/arch/parisc/kernel/ftrace.c
> +++ b/arch/parisc/kernel/ftrace.c
> @@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> struct kprobe *p;
> int bit;
>
> + if (unlikely(kprobe_ftrace_disabled))
> + return;
> +
> bit = ftrace_test_recursion_trylock(ip, parent_ip);
> if (bit < 0)
> return;
> diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
> index 072ebe7f290b..f8208c027148 100644
> --- a/arch/powerpc/kernel/kprobes-ftrace.c
> +++ b/arch/powerpc/kernel/kprobes-ftrace.c
> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
> struct pt_regs *regs;
> int bit;
>
> + if (unlikely(kprobe_ftrace_disabled))
> + return;
> +
> bit = ftrace_test_recursion_trylock(nip, parent_nip);
> if (bit < 0)
> return;
> diff --git a/arch/riscv/kernel/probes/ftrace.c b/arch/riscv/kernel/probes/ftrace.c
> index 7142ec42e889..a69dfa610aa8 100644
> --- a/arch/riscv/kernel/probes/ftrace.c
> +++ b/arch/riscv/kernel/probes/ftrace.c
> @@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> struct kprobe_ctlblk *kcb;
> int bit;
>
> + if (unlikely(kprobe_ftrace_disabled))
> + return;
> +
> bit = ftrace_test_recursion_trylock(ip, parent_ip);
> if (bit < 0)
> return;
> diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
> index c46381ea04ec..7f6f8c438c26 100644
> --- a/arch/s390/kernel/ftrace.c
> +++ b/arch/s390/kernel/ftrace.c
> @@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> struct kprobe *p;
> int bit;
>
> + if (unlikely(kprobe_ftrace_disabled))
> + return;
> +
> bit = ftrace_test_recursion_trylock(ip, parent_ip);
> if (bit < 0)
> return;
> diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
> index dd2ec14adb77..15af7e98e161 100644
> --- a/arch/x86/kernel/kprobes/ftrace.c
> +++ b/arch/x86/kernel/kprobes/ftrace.c
> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> struct kprobe_ctlblk *kcb;
> int bit;
>
> + if (unlikely(kprobe_ftrace_disabled))
> + return;
> +
> bit = ftrace_test_recursion_trylock(ip, parent_ip);
> if (bit < 0)
> return;
> diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
> index 0ff44d6633e3..5fcbc254d186 100644
> --- a/include/linux/kprobes.h
> +++ b/include/linux/kprobes.h
> @@ -378,11 +378,15 @@ static inline void wait_for_kprobe_optimizer(void) { }
> extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> struct ftrace_ops *ops, struct ftrace_regs *fregs);
> extern int arch_prepare_kprobe_ftrace(struct kprobe *p);
> +/* Set when ftrace has been killed: kprobes on ftrace must be disabled for safety */
> +extern bool kprobe_ftrace_disabled __read_mostly;
> +extern void kprobe_ftrace_kill(void);
> #else
> static inline int arch_prepare_kprobe_ftrace(struct kprobe *p)
> {
> return -EINVAL;
> }
> +static inline void kprobe_ftrace_kill(void) {}
> #endif /* CONFIG_KPROBES_ON_FTRACE */
>
> /* Get the kprobe at this addr (if any) - called with preemption disabled */
> @@ -495,6 +499,9 @@ static inline void kprobe_flush_task(struct task_struct *tk)
> static inline void kprobe_free_init_mem(void)
> {
> }
> +static inline void kprobe_ftrace_kill(void)
> +{
> +}
> static inline int disable_kprobe(struct kprobe *kp)
> {
> return -EOPNOTSUPP;
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 65adc815fc6e..166ebf81dc45 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -1068,6 +1068,7 @@ static struct ftrace_ops kprobe_ipmodify_ops __read_mostly = {
>
> static int kprobe_ipmodify_enabled;
> static int kprobe_ftrace_enabled;
> +bool kprobe_ftrace_disabled;
>
> static int __arm_kprobe_ftrace(struct kprobe *p, struct ftrace_ops *ops,
> int *cnt)
> @@ -1136,6 +1137,11 @@ static int disarm_kprobe_ftrace(struct kprobe *p)
> ipmodify ? &kprobe_ipmodify_ops : &kprobe_ftrace_ops,
> ipmodify ? &kprobe_ipmodify_enabled : &kprobe_ftrace_enabled);
> }
> +
> +void kprobe_ftrace_kill()
> +{
> + kprobe_ftrace_disabled = true;
> +}
> #else /* !CONFIG_KPROBES_ON_FTRACE */
> static inline int arm_kprobe_ftrace(struct kprobe *p)
> {
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index da1710499698..96db99c347b3 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -7895,6 +7895,7 @@ void ftrace_kill(void)
> ftrace_disabled = 1;
> ftrace_enabled = 0;
> ftrace_trace_function = ftrace_stub;
> + kprobe_ftrace_kill();
> }
>
> /**
> --
> 2.39.3
>


--
Best Regards
Guo Ren

2024-05-02 02:04:08

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

On Thu, 2 May 2024 01:35:16 +0800
Guo Ren <[email protected]> wrote:

> On Thu, May 2, 2024 at 12:30 AM Stephen Brennan
> <[email protected]> wrote:
> >
> > If an error happens in ftrace, ftrace_kill() will prevent disarming
> > kprobes. Eventually, the ftrace_ops associated with the kprobes will be
> > freed, yet the kprobes will still be active, and when triggered, they
> > will use the freed memory, likely resulting in a page fault and panic.
> >
> > This behavior can be reproduced quite easily, by creating a kprobe and
> > then triggering a ftrace_kill(). For simplicity, we can simulate an
> > ftrace error with a kernel module like [1]:
> >
> > [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
> >
> > sudo perf probe --add commit_creds
> > sudo perf trace -e probe:commit_creds
> > # In another terminal
> > make
> > sudo insmod ftrace_killer.ko # calls ftrace_kill(), simulating bug
> > # Back to perf terminal
> > # ctrl-c
> > sudo perf probe --del commit_creds
> >
> > After a short period, a page fault and panic would occur as the kprobe
> > continues to execute and uses the freed ftrace_ops. While ftrace_kill()
> > is supposed to be used only in extreme circumstances, it is invoked in
> > FTRACE_WARN_ON() and so there are many places where an unexpected bug
> > could be triggered, yet the system may continue operating, possibly
> > without the administrator noticing. If ftrace_kill() does not panic the
> > system, then we should do everything we can to continue operating,
> > rather than leave a ticking time bomb.
> >
> > Signed-off-by: Stephen Brennan <[email protected]>
> > ---
> > Changes in v3:
> > Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
> > variable and check it directly in the kprobe handlers.
> > Link to v1/v2 discussion:
> > https://lore.kernel.org/all/[email protected]/
> >
> > arch/csky/kernel/probes/ftrace.c | 3 +++
> > arch/loongarch/kernel/ftrace_dyn.c | 3 +++
> > arch/parisc/kernel/ftrace.c | 3 +++
> > arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
> > arch/riscv/kernel/probes/ftrace.c | 3 +++
> > arch/s390/kernel/ftrace.c | 3 +++
> > arch/x86/kernel/kprobes/ftrace.c | 3 +++
> > include/linux/kprobes.h | 7 +++++++
> > kernel/kprobes.c | 6 ++++++
> > kernel/trace/ftrace.c | 1 +
> > 10 files changed, 35 insertions(+)
> >
> > diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
> > index 834cffcfbce3..7ba4b98076de 100644
> > --- a/arch/csky/kernel/probes/ftrace.c
> > +++ b/arch/csky/kernel/probes/ftrace.c
> > @@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> > struct kprobe_ctlblk *kcb;
> > struct pt_regs *regs;
> >
> > + if (unlikely(kprobe_ftrace_disabled))
> > + return;
> > +
> For csky part.
> Acked-by: Guo Ren <[email protected]>

Thanks Stephen, Guo and Steve!

Let me pick this to probes/for-next!

Thank you,

>
> > bit = ftrace_test_recursion_trylock(ip, parent_ip);
> > if (bit < 0)
> > return;
> > diff --git a/arch/loongarch/kernel/ftrace_dyn.c b/arch/loongarch/kernel/ftrace_dyn.c
> > index 73858c9029cc..bff058317062 100644
> > --- a/arch/loongarch/kernel/ftrace_dyn.c
> > +++ b/arch/loongarch/kernel/ftrace_dyn.c
> > @@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> > struct kprobe *p;
> > struct kprobe_ctlblk *kcb;
> >
> > + if (unlikely(kprobe_ftrace_disabled))
> > + return;
> > +
> > bit = ftrace_test_recursion_trylock(ip, parent_ip);
> > if (bit < 0)
> > return;
> > diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
> > index 621a4b386ae4..c91f9c2e61ed 100644
> > --- a/arch/parisc/kernel/ftrace.c
> > +++ b/arch/parisc/kernel/ftrace.c
> > @@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> > struct kprobe *p;
> > int bit;
> >
> > + if (unlikely(kprobe_ftrace_disabled))
> > + return;
> > +
> > bit = ftrace_test_recursion_trylock(ip, parent_ip);
> > if (bit < 0)
> > return;
> > diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
> > index 072ebe7f290b..f8208c027148 100644
> > --- a/arch/powerpc/kernel/kprobes-ftrace.c
> > +++ b/arch/powerpc/kernel/kprobes-ftrace.c
> > @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
> > struct pt_regs *regs;
> > int bit;
> >
> > + if (unlikely(kprobe_ftrace_disabled))
> > + return;
> > +
> > bit = ftrace_test_recursion_trylock(nip, parent_nip);
> > if (bit < 0)
> > return;
> > diff --git a/arch/riscv/kernel/probes/ftrace.c b/arch/riscv/kernel/probes/ftrace.c
> > index 7142ec42e889..a69dfa610aa8 100644
> > --- a/arch/riscv/kernel/probes/ftrace.c
> > +++ b/arch/riscv/kernel/probes/ftrace.c
> > @@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> > struct kprobe_ctlblk *kcb;
> > int bit;
> >
> > + if (unlikely(kprobe_ftrace_disabled))
> > + return;
> > +
> > bit = ftrace_test_recursion_trylock(ip, parent_ip);
> > if (bit < 0)
> > return;
> > diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
> > index c46381ea04ec..7f6f8c438c26 100644
> > --- a/arch/s390/kernel/ftrace.c
> > +++ b/arch/s390/kernel/ftrace.c
> > @@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> > struct kprobe *p;
> > int bit;
> >
> > + if (unlikely(kprobe_ftrace_disabled))
> > + return;
> > +
> > bit = ftrace_test_recursion_trylock(ip, parent_ip);
> > if (bit < 0)
> > return;
> > diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
> > index dd2ec14adb77..15af7e98e161 100644
> > --- a/arch/x86/kernel/kprobes/ftrace.c
> > +++ b/arch/x86/kernel/kprobes/ftrace.c
> > @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> > struct kprobe_ctlblk *kcb;
> > int bit;
> >
> > + if (unlikely(kprobe_ftrace_disabled))
> > + return;
> > +
> > bit = ftrace_test_recursion_trylock(ip, parent_ip);
> > if (bit < 0)
> > return;
> > diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
> > index 0ff44d6633e3..5fcbc254d186 100644
> > --- a/include/linux/kprobes.h
> > +++ b/include/linux/kprobes.h
> > @@ -378,11 +378,15 @@ static inline void wait_for_kprobe_optimizer(void) { }
> > extern void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
> > struct ftrace_ops *ops, struct ftrace_regs *fregs);
> > extern int arch_prepare_kprobe_ftrace(struct kprobe *p);
> > +/* Set when ftrace has been killed: kprobes on ftrace must be disabled for safety */
> > +extern bool kprobe_ftrace_disabled __read_mostly;
> > +extern void kprobe_ftrace_kill(void);
> > #else
> > static inline int arch_prepare_kprobe_ftrace(struct kprobe *p)
> > {
> > return -EINVAL;
> > }
> > +static inline void kprobe_ftrace_kill(void) {}
> > #endif /* CONFIG_KPROBES_ON_FTRACE */
> >
> > /* Get the kprobe at this addr (if any) - called with preemption disabled */
> > @@ -495,6 +499,9 @@ static inline void kprobe_flush_task(struct task_struct *tk)
> > static inline void kprobe_free_init_mem(void)
> > {
> > }
> > +static inline void kprobe_ftrace_kill(void)
> > +{
> > +}
> > static inline int disable_kprobe(struct kprobe *kp)
> > {
> > return -EOPNOTSUPP;
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 65adc815fc6e..166ebf81dc45 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -1068,6 +1068,7 @@ static struct ftrace_ops kprobe_ipmodify_ops __read_mostly = {
> >
> > static int kprobe_ipmodify_enabled;
> > static int kprobe_ftrace_enabled;
> > +bool kprobe_ftrace_disabled;
> >
> > static int __arm_kprobe_ftrace(struct kprobe *p, struct ftrace_ops *ops,
> > int *cnt)
> > @@ -1136,6 +1137,11 @@ static int disarm_kprobe_ftrace(struct kprobe *p)
> > ipmodify ? &kprobe_ipmodify_ops : &kprobe_ftrace_ops,
> > ipmodify ? &kprobe_ipmodify_enabled : &kprobe_ftrace_enabled);
> > }
> > +
> > +void kprobe_ftrace_kill()
> > +{
> > + kprobe_ftrace_disabled = true;
> > +}
> > #else /* !CONFIG_KPROBES_ON_FTRACE */
> > static inline int arm_kprobe_ftrace(struct kprobe *p)
> > {
> > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> > index da1710499698..96db99c347b3 100644
> > --- a/kernel/trace/ftrace.c
> > +++ b/kernel/trace/ftrace.c
> > @@ -7895,6 +7895,7 @@ void ftrace_kill(void)
> > ftrace_disabled = 1;
> > ftrace_enabled = 0;
> > ftrace_trace_function = ftrace_stub;
> > + kprobe_ftrace_kill();
> > }
> >
> > /**
> > --
> > 2.39.3
> >
>
>
> --
> Best Regards
> Guo Ren


--
Masami Hiramatsu (Google) <[email protected]>

2024-05-06 14:48:18

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed



Le 01/05/2024 à 18:29, Stephen Brennan a écrit :
> If an error happens in ftrace, ftrace_kill() will prevent disarming
> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
> freed, yet the kprobes will still be active, and when triggered, they
> will use the freed memory, likely resulting in a page fault and panic.
>
> This behavior can be reproduced quite easily, by creating a kprobe and
> then triggering a ftrace_kill(). For simplicity, we can simulate an
> ftrace error with a kernel module like [1]:
>
> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>
> sudo perf probe --add commit_creds
> sudo perf trace -e probe:commit_creds
> # In another terminal
> make
> sudo insmod ftrace_killer.ko # calls ftrace_kill(), simulating bug
> # Back to perf terminal
> # ctrl-c
> sudo perf probe --del commit_creds
>
> After a short period, a page fault and panic would occur as the kprobe
> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
> is supposed to be used only in extreme circumstances, it is invoked in
> FTRACE_WARN_ON() and so there are many places where an unexpected bug
> could be triggered, yet the system may continue operating, possibly
> without the administrator noticing. If ftrace_kill() does not panic the
> system, then we should do everything we can to continue operating,
> rather than leave a ticking time bomb.
>
> Signed-off-by: Stephen Brennan <[email protected]>
> ---
> Changes in v3:
> Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
> variable and check it directly in the kprobe handlers.

Isn't it safer to provide a fonction rather than a direct access to a
variable ?

By the way, wouldn't it be more performant to use a static branch (jump
label) ?

Christophe

2024-05-06 22:50:27

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

On Mon, 6 May 2024 14:46:57 +0000
Christophe Leroy <[email protected]> wrote:

> Isn't it safer to provide a fonction rather than a direct access to a
> variable ?
>
> By the way, wouldn't it be more performant to use a static branch (jump
> label) ?

A static branch could work, but the point of this is that if ftrace
failed, it was likely due to an issue with text modification. Do we want to
stop it via text modification?

-- Steve

2024-05-07 18:56:37

by Stephen Brennan

[permalink] [raw]
Subject: Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

Christophe Leroy <[email protected]> writes:
> Le 01/05/2024 à 18:29, Stephen Brennan a écrit :
>> If an error happens in ftrace, ftrace_kill() will prevent disarming
>> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
>> freed, yet the kprobes will still be active, and when triggered, they
>> will use the freed memory, likely resulting in a page fault and panic.
>>
>> This behavior can be reproduced quite easily, by creating a kprobe and
>> then triggering a ftrace_kill(). For simplicity, we can simulate an
>> ftrace error with a kernel module like [1]:
>>
>> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>>
>> sudo perf probe --add commit_creds
>> sudo perf trace -e probe:commit_creds
>> # In another terminal
>> make
>> sudo insmod ftrace_killer.ko # calls ftrace_kill(), simulating bug
>> # Back to perf terminal
>> # ctrl-c
>> sudo perf probe --del commit_creds
>>
>> After a short period, a page fault and panic would occur as the kprobe
>> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
>> is supposed to be used only in extreme circumstances, it is invoked in
>> FTRACE_WARN_ON() and so there are many places where an unexpected bug
>> could be triggered, yet the system may continue operating, possibly
>> without the administrator noticing. If ftrace_kill() does not panic the
>> system, then we should do everything we can to continue operating,
>> rather than leave a ticking time bomb.
>>
>> Signed-off-by: Stephen Brennan <[email protected]>
>> ---
>> Changes in v3:
>> Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
>> variable and check it directly in the kprobe handlers.
>
> Isn't it safer to provide a fonction rather than a direct access to a
> variable ?

Is the concern that other code could modify this variable? If so, then I
suppose the function call is safer. But the variable is not exported and
I think built-in code can be trusted not to muck with it. Maybe I'm
missing your point about safety though?

> By the way, wouldn't it be more performant to use a static branch (jump
> label) ?

I agree with Steven's concern that text modification would unfortunately
not be a good way to handle an error in text modification. Especially, I
believe there could be deadlock risks, as static key enablement requires
taking the text_mutex and the jump_label_mutex. I'd be concerned that
the text_mutex could already be held in some situations where
ftrace_kill() is called. But I'm not certain about that.

Thanks for taking a look!
Stephen