Hi,
Here is the 2nd version of the kprobe bugfixes on arm64. The previous
series is here;
https://lore.kernel.org/all/166990553243.253128.13594802750635478633.stgit@devnote3/T/#u
The first one is a critical issue, which will cause a kernel crach
easily with lockdep[1/3]. Others are minor issues and rare cases.
The [2/3] let do_page_fault() fixup the page fault in kprobes user
handler, and [3/3] is more like code cleanup and returns
DBG_HOOK_ERROR if it can not handle kprobe's BRK (but that
should not happen.)
In this version I update to use noinstr and __always_inline [1/3]
and update comments[3/3]. Also add Mark's Ack (Thanks!)
Thank you,
---
Masami Hiramatsu (Google) (3):
arm64: Prohibit instrumentation on arch_stack_walk()
arm64: kprobes: Let arch do_page_fault() fix up page fault in user handler
arm64: kprobes: Return DBG_HOOK_ERROR if kprobes can not handle a BRK
arch/arm64/kernel/probes/kprobes.c | 86 +++++++++++++++---------------------
arch/arm64/kernel/stacktrace.c | 10 ++--
2 files changed, 41 insertions(+), 55 deletions(-)
--
Masami Hiramatsu (Google) <[email protected]>
From: Masami Hiramatsu (Google) <[email protected]>
Mark arch_stack_walk() as noinstr instead of notrace and inline functions
called from arch_stack_walk() as __always_inline so that user does not
put any instrumentations on it, because this function can be used from
return_address() which is used by lockdep.
Without this, if the kernel built with CONFIG_LOCKDEP=y, just probing
arch_stack_walk() via <tracefs>/kprobe_events will crash the kernel on
arm64.
# echo p arch_stack_walk >> ${TRACEFS}/kprobe_events
# echo 1 > ${TRACEFS}/events/kprobes/enable
kprobes: Failed to recover from reentered kprobes.
kprobes: Dump kprobe:
.symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
------------[ cut here ]------------
kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
kprobes: Failed to recover from reentered kprobes.
kprobes: Dump kprobe:
.symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
------------[ cut here ]------------
kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
PREEMPT SMP
Modules linked in:
CPU: 0 PID: 17 Comm: migration/0 Tainted: G N 6.1.0-rc5+ #6
Hardware name: linux,dummy-virt (DT)
Stopper: 0x0 <- 0x0
pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : kprobe_breakpoint_handler+0x178/0x17c
lr : kprobe_breakpoint_handler+0x178/0x17c
sp : ffff8000080d3090
x29: ffff8000080d3090 x28: ffff0df5845798c0 x27: ffffc4f59057a774
x26: ffff0df5ffbba770 x25: ffff0df58f420f18 x24: ffff49006f641000
x23: ffffc4f590579768 x22: ffff0df58f420f18 x21: ffff8000080d31c0
x20: ffffc4f590579768 x19: ffffc4f590579770 x18: 0000000000000006
x17: 5f6b636174735f68 x16: 637261203d207264 x15: 64612e202c30203d
x14: 2074657366666f2e x13: 30633178302f3078 x12: 302b6b6c61775f6b
x11: 636174735f686372 x10: ffffc4f590dc5bd8 x9 : ffffc4f58eb31958
x8 : 00000000ffffefff x7 : ffffc4f590dc5bd8 x6 : 80000000fffff000
x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000000000000000 x1 : ffff0df5845798c0 x0 : 0000000000000064
Call trace:
kprobes: Failed to recover from reentered kprobes.
kprobes: Dump kprobe:
.symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
------------[ cut here ]------------
kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
Fixes: 39ef362d2d45 ("arm64: Make return_address() use arch_stack_walk()")
Cc: [email protected]
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
---
Changes in v2:
- Use noinstr instead of NOKPROBE_SYMBOL()
- Use __always_inline because nokprobe_inline will be changed if
CONFIG_KPROBES=n.
- Fix indentation.
---
arch/arm64/kernel/stacktrace.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 634279b3b03d..117e2c180f3c 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -23,8 +23,8 @@
*
* The regs must be on a stack currently owned by the calling task.
*/
-static inline void unwind_init_from_regs(struct unwind_state *state,
- struct pt_regs *regs)
+static __always_inline void unwind_init_from_regs(struct unwind_state *state,
+ struct pt_regs *regs)
{
unwind_init_common(state, current);
@@ -58,8 +58,8 @@ static __always_inline void unwind_init_from_caller(struct unwind_state *state)
* duration of the unwind, or the unwind will be bogus. It is never valid to
* call this for the current task.
*/
-static inline void unwind_init_from_task(struct unwind_state *state,
- struct task_struct *task)
+static __always_inline void unwind_init_from_task(struct unwind_state *state,
+ struct task_struct *task)
{
unwind_init_common(state, task);
@@ -186,7 +186,7 @@ void show_stack(struct task_struct *tsk, unsigned long *sp, const char *loglvl)
: stackinfo_get_unknown(); \
})
-noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
+noinline noinstr void arch_stack_walk(stack_trace_consume_fn consume_entry,
void *cookie, struct task_struct *task,
struct pt_regs *regs)
{
On Fri, Dec 02, 2022 at 11:18:33AM +0900, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <[email protected]>
>
> Mark arch_stack_walk() as noinstr instead of notrace and inline functions
> called from arch_stack_walk() as __always_inline so that user does not
> put any instrumentations on it, because this function can be used from
> return_address() which is used by lockdep.
Hmmm... since arch_stack_walk is marked as notrace, that will be prohibited by
default unless the kernel was built with CONFIG_KPROBE_EVENTS_ON_NOTRACE=y,
and the commit message for that says:
│ This option disables such protection and allows you to put kprobe │
│ events on ftrace functions for debugging ftrace by itself. │
│ Note that this might let you shoot yourself in the foot.
... and IIUC we generally don't expect people to set that, and that might be
worth calling out in the commit message.
> Without this, if the kernel built with CONFIG_LOCKDEP=y, just probing
> arch_stack_walk() via <tracefs>/kprobe_events will crash the kernel on
> arm64.
>
> # echo p arch_stack_walk >> ${TRACEFS}/kprobe_events
> # echo 1 > ${TRACEFS}/events/kprobes/enable
I had a go at testing this patch, and it fixes the crash with the reproducer
above, but there are plenty of other instances in stacktrace.c that lead to the
same sort of crash, e.g.
# echo p stackinfo_get_task >> ${TRACEFS}/kprobe_events
# echo 1 > ${TRACEFS}/events/kprobes/enable
... so I think there's more that we need to do to fix this generally.
Note: I found interesting functions to try tracing by looking at the output of:
aarch64-linux-objdump -t arch/arm64/kernel/stacktrace.o | grep -w '.text'
That all said, I think this patch is nice-to-have, and that we can address the
other cases as a follow-up, so for this patch as-is (with or without some
wording regarding CONFIG_KPROBE_EVENTS_ON_NOTRACE):
Acked-by: Mark Rutland <[email protected]>
Mark.
> kprobes: Failed to recover from reentered kprobes.
> kprobes: Dump kprobe:
> .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
> ------------[ cut here ]------------
> kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
> kprobes: Failed to recover from reentered kprobes.
> kprobes: Dump kprobe:
> .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
> ------------[ cut here ]------------
> kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
> PREEMPT SMP
> Modules linked in:
> CPU: 0 PID: 17 Comm: migration/0 Tainted: G N 6.1.0-rc5+ #6
> Hardware name: linux,dummy-virt (DT)
> Stopper: 0x0 <- 0x0
> pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : kprobe_breakpoint_handler+0x178/0x17c
> lr : kprobe_breakpoint_handler+0x178/0x17c
> sp : ffff8000080d3090
> x29: ffff8000080d3090 x28: ffff0df5845798c0 x27: ffffc4f59057a774
> x26: ffff0df5ffbba770 x25: ffff0df58f420f18 x24: ffff49006f641000
> x23: ffffc4f590579768 x22: ffff0df58f420f18 x21: ffff8000080d31c0
> x20: ffffc4f590579768 x19: ffffc4f590579770 x18: 0000000000000006
> x17: 5f6b636174735f68 x16: 637261203d207264 x15: 64612e202c30203d
> x14: 2074657366666f2e x13: 30633178302f3078 x12: 302b6b6c61775f6b
> x11: 636174735f686372 x10: ffffc4f590dc5bd8 x9 : ffffc4f58eb31958
> x8 : 00000000ffffefff x7 : ffffc4f590dc5bd8 x6 : 80000000fffff000
> x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000
> x2 : 0000000000000000 x1 : ffff0df5845798c0 x0 : 0000000000000064
> Call trace:
> kprobes: Failed to recover from reentered kprobes.
> kprobes: Dump kprobe:
> .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
> ------------[ cut here ]------------
> kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
>
> Fixes: 39ef362d2d45 ("arm64: Make return_address() use arch_stack_walk()")
> Cc: [email protected]
> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
> ---
> Changes in v2:
> - Use noinstr instead of NOKPROBE_SYMBOL()
> - Use __always_inline because nokprobe_inline will be changed if
> CONFIG_KPROBES=n.
> - Fix indentation.
> ---
> arch/arm64/kernel/stacktrace.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index 634279b3b03d..117e2c180f3c 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -23,8 +23,8 @@
> *
> * The regs must be on a stack currently owned by the calling task.
> */
> -static inline void unwind_init_from_regs(struct unwind_state *state,
> - struct pt_regs *regs)
> +static __always_inline void unwind_init_from_regs(struct unwind_state *state,
> + struct pt_regs *regs)
> {
> unwind_init_common(state, current);
>
> @@ -58,8 +58,8 @@ static __always_inline void unwind_init_from_caller(struct unwind_state *state)
> * duration of the unwind, or the unwind will be bogus. It is never valid to
> * call this for the current task.
> */
> -static inline void unwind_init_from_task(struct unwind_state *state,
> - struct task_struct *task)
> +static __always_inline void unwind_init_from_task(struct unwind_state *state,
> + struct task_struct *task)
> {
> unwind_init_common(state, task);
>
> @@ -186,7 +186,7 @@ void show_stack(struct task_struct *tsk, unsigned long *sp, const char *loglvl)
> : stackinfo_get_unknown(); \
> })
>
> -noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
> +noinline noinstr void arch_stack_walk(stack_trace_consume_fn consume_entry,
> void *cookie, struct task_struct *task,
> struct pt_regs *regs)
> {
>
On Fri, 2 Dec 2022 11:18:23 +0900, Masami Hiramatsu (Google) wrote:
> Here is the 2nd version of the kprobe bugfixes on arm64. The previous
> series is here;
>
> https://lore.kernel.org/all/166990553243.253128.13594802750635478633.stgit@devnote3/T/#u
>
> The first one is a critical issue, which will cause a kernel crach
> easily with lockdep[1/3]. Others are minor issues and rare cases.
> The [2/3] let do_page_fault() fixup the page fault in kprobes user
> handler, and [3/3] is more like code cleanup and returns
> DBG_HOOK_ERROR if it can not handle kprobe's BRK (but that
> should not happen.)
>
> [...]
Applied to arm64 (for-next/kprobes), thanks!
[1/3] arm64: Prohibit instrumentation on arch_stack_walk()
https://git.kernel.org/arm64/c/0fbcd8abf337
[2/3] arm64: kprobes: Let arch do_page_fault() fix up page fault in user handler
https://git.kernel.org/arm64/c/30a421552373
[3/3] arm64: kprobes: Return DBG_HOOK_ERROR if kprobes can not handle a BRK
https://git.kernel.org/arm64/c/3b84efc066f8
Cheers,
--
Will
https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
On Fri, 2 Dec 2022 13:17:30 +0000
Mark Rutland <[email protected]> wrote:
> On Fri, Dec 02, 2022 at 11:18:33AM +0900, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) <[email protected]>
> >
> > Mark arch_stack_walk() as noinstr instead of notrace and inline functions
> > called from arch_stack_walk() as __always_inline so that user does not
> > put any instrumentations on it, because this function can be used from
> > return_address() which is used by lockdep.
>
> Hmmm... since arch_stack_walk is marked as notrace, that will be prohibited by
> default unless the kernel was built with CONFIG_KPROBE_EVENTS_ON_NOTRACE=y,
> and the commit message for that says:
>
> │ This option disables such protection and allows you to put kprobe │
> │ events on ftrace functions for debugging ftrace by itself. │
> │ Note that this might let you shoot yourself in the foot.
>
> ... and IIUC we generally don't expect people to set that, and that might be
> worth calling out in the commit message.
Yeah, but that is the feature for kprobe events, and not applied to
the kprobes itself. Maybe I should use a test kprobe module to
put a probe on it.
>
> > Without this, if the kernel built with CONFIG_LOCKDEP=y, just probing
> > arch_stack_walk() via <tracefs>/kprobe_events will crash the kernel on
> > arm64.
> >
> > # echo p arch_stack_walk >> ${TRACEFS}/kprobe_events
> > # echo 1 > ${TRACEFS}/events/kprobes/enable
>
> I had a go at testing this patch, and it fixes the crash with the reproducer
> above, but there are plenty of other instances in stacktrace.c that lead to the
> same sort of crash, e.g.
>
> # echo p stackinfo_get_task >> ${TRACEFS}/kprobe_events
> # echo 1 > ${TRACEFS}/events/kprobes/enable
Oops, thanks for pointing! Hmm, I thought stackinfo_get_task() is an
inlined function usually. Maybe we should make it nokprobe_inline.
But this is just one case. I need to scan all symbols to trace...
>
> ... so I think there's more that we need to do to fix this generally.
>
> Note: I found interesting functions to try tracing by looking at the output of:
>
> aarch64-linux-objdump -t arch/arm64/kernel/stacktrace.o | grep -w '.text'
>
> That all said, I think this patch is nice-to-have, and that we can address the
> other cases as a follow-up, so for this patch as-is (with or without some
> wording regarding CONFIG_KPROBE_EVENTS_ON_NOTRACE):
>
> Acked-by: Mark Rutland <[email protected]>
Thanks!
>
> Mark.
>
> > kprobes: Failed to recover from reentered kprobes.
> > kprobes: Dump kprobe:
> > .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
> > ------------[ cut here ]------------
> > kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
> > kprobes: Failed to recover from reentered kprobes.
> > kprobes: Dump kprobe:
> > .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
> > ------------[ cut here ]------------
> > kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
> > PREEMPT SMP
> > Modules linked in:
> > CPU: 0 PID: 17 Comm: migration/0 Tainted: G N 6.1.0-rc5+ #6
> > Hardware name: linux,dummy-virt (DT)
> > Stopper: 0x0 <- 0x0
> > pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > pc : kprobe_breakpoint_handler+0x178/0x17c
> > lr : kprobe_breakpoint_handler+0x178/0x17c
> > sp : ffff8000080d3090
> > x29: ffff8000080d3090 x28: ffff0df5845798c0 x27: ffffc4f59057a774
> > x26: ffff0df5ffbba770 x25: ffff0df58f420f18 x24: ffff49006f641000
> > x23: ffffc4f590579768 x22: ffff0df58f420f18 x21: ffff8000080d31c0
> > x20: ffffc4f590579768 x19: ffffc4f590579770 x18: 0000000000000006
> > x17: 5f6b636174735f68 x16: 637261203d207264 x15: 64612e202c30203d
> > x14: 2074657366666f2e x13: 30633178302f3078 x12: 302b6b6c61775f6b
> > x11: 636174735f686372 x10: ffffc4f590dc5bd8 x9 : ffffc4f58eb31958
> > x8 : 00000000ffffefff x7 : ffffc4f590dc5bd8 x6 : 80000000fffff000
> > x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000
> > x2 : 0000000000000000 x1 : ffff0df5845798c0 x0 : 0000000000000064
> > Call trace:
> > kprobes: Failed to recover from reentered kprobes.
> > kprobes: Dump kprobe:
> > .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
> > ------------[ cut here ]------------
> > kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
> >
> > Fixes: 39ef362d2d45 ("arm64: Make return_address() use arch_stack_walk()")
> > Cc: [email protected]
> > Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
> > ---
> > Changes in v2:
> > - Use noinstr instead of NOKPROBE_SYMBOL()
> > - Use __always_inline because nokprobe_inline will be changed if
> > CONFIG_KPROBES=n.
> > - Fix indentation.
> > ---
> > arch/arm64/kernel/stacktrace.c | 10 +++++-----
> > 1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> > index 634279b3b03d..117e2c180f3c 100644
> > --- a/arch/arm64/kernel/stacktrace.c
> > +++ b/arch/arm64/kernel/stacktrace.c
> > @@ -23,8 +23,8 @@
> > *
> > * The regs must be on a stack currently owned by the calling task.
> > */
> > -static inline void unwind_init_from_regs(struct unwind_state *state,
> > - struct pt_regs *regs)
> > +static __always_inline void unwind_init_from_regs(struct unwind_state *state,
> > + struct pt_regs *regs)
> > {
> > unwind_init_common(state, current);
> >
> > @@ -58,8 +58,8 @@ static __always_inline void unwind_init_from_caller(struct unwind_state *state)
> > * duration of the unwind, or the unwind will be bogus. It is never valid to
> > * call this for the current task.
> > */
> > -static inline void unwind_init_from_task(struct unwind_state *state,
> > - struct task_struct *task)
> > +static __always_inline void unwind_init_from_task(struct unwind_state *state,
> > + struct task_struct *task)
> > {
> > unwind_init_common(state, task);
> >
> > @@ -186,7 +186,7 @@ void show_stack(struct task_struct *tsk, unsigned long *sp, const char *loglvl)
> > : stackinfo_get_unknown(); \
> > })
> >
> > -noinline notrace void arch_stack_walk(stack_trace_consume_fn consume_entry,
> > +noinline noinstr void arch_stack_walk(stack_trace_consume_fn consume_entry,
> > void *cookie, struct task_struct *task,
> > struct pt_regs *regs)
> > {
> >
--
Masami Hiramatsu (Google) <[email protected]>
On Tue, Dec 06, 2022 at 09:41:37AM +0900, Masami Hiramatsu wrote:
> On Fri, 2 Dec 2022 13:17:30 +0000
> Mark Rutland <[email protected]> wrote:
> > I had a go at testing this patch, and it fixes the crash with the reproducer
> > above, but there are plenty of other instances in stacktrace.c that lead to the
> > same sort of crash, e.g.
> >
> > # echo p stackinfo_get_task >> ${TRACEFS}/kprobe_events
> > # echo 1 > ${TRACEFS}/events/kprobes/enable
>
> Oops, thanks for pointing! Hmm, I thought stackinfo_get_task() is an
> inlined function usually. Maybe we should make it nokprobe_inline.
> But this is just one case. I need to scan all symbols to trace...
FWIW, due to other instrumentation issues I've started doing a larger noinstr
cleanup on arm64 which should address this, as fixing this properly involves
also modifying a bunch of underlying helpers (e.g. preempt_count()).
I can Cc you when sending that out, if you'd like? That'll probably be in the
new year.
Thanks,
Mark.
On Fri, 9 Dec 2022 11:16:34 +0000
Mark Rutland <[email protected]> wrote:
> On Tue, Dec 06, 2022 at 09:41:37AM +0900, Masami Hiramatsu wrote:
> > On Fri, 2 Dec 2022 13:17:30 +0000
> > Mark Rutland <[email protected]> wrote:
> > > I had a go at testing this patch, and it fixes the crash with the reproducer
> > > above, but there are plenty of other instances in stacktrace.c that lead to the
> > > same sort of crash, e.g.
> > >
> > > # echo p stackinfo_get_task >> ${TRACEFS}/kprobe_events
> > > # echo 1 > ${TRACEFS}/events/kprobes/enable
> >
> > Oops, thanks for pointing! Hmm, I thought stackinfo_get_task() is an
> > inlined function usually. Maybe we should make it nokprobe_inline.
> > But this is just one case. I need to scan all symbols to trace...
>
> FWIW, due to other instrumentation issues I've started doing a larger noinstr
> cleanup on arm64 which should address this, as fixing this properly involves
> also modifying a bunch of underlying helpers (e.g. preempt_count()).
>
> I can Cc you when sending that out, if you'd like? That'll probably be in the
> new year.
Thanks Mark, yes, I'm interested in that work.
Until that, I'll finish making a script for testing kprobes widely on
kernel symbols.
Thank you,
>
> Thanks,
> Mark.
--
Masami Hiramatsu (Google) <[email protected]>