When kernel panic occurs, a kernel module can use either the
panic_notifier or die_notifier to obtain the debugging information.
However, in case of these exceptions like do_undefinstr(), regs and
esr data are not passed on. Although a module might be able to find
those data in the console messages, parsing text messages is very
expensive behavior for a module especially on mobile devices.
These bare tracepoints allow a module to probe regs and esr information
for debugging purpose. _tp suffix comes from bare tracepoints of
sched/core.c
Signed-off-by: Sangmoon Kim <[email protected]>
---
arch/arm64/kernel/traps.c | 15 +++++++++++++
include/trace/events/traps.h | 41 ++++++++++++++++++++++++++++++++++++
2 files changed, 56 insertions(+)
create mode 100644 include/trace/events/traps.h
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index a05d34f0e82a..e58797743442 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -45,6 +45,17 @@
#include <asm/system_misc.h>
#include <asm/sysreg.h>
+#define CREATE_TRACE_POINTS
+#include <trace/events/traps.h>
+
+/*
+ * Export tracepoints that act as a bare tracehook (ie: have no trace event
+ * associated with them) to allow external modules to probe them.
+ */
+EXPORT_TRACEPOINT_SYMBOL_GPL(traps_undefinstr_tp);
+EXPORT_TRACEPOINT_SYMBOL_GPL(traps_bad_mode_tp);
+EXPORT_TRACEPOINT_SYMBOL_GPL(traps_serror_panic_tp);
+
static const char *handler[] = {
"Synchronous Abort",
"IRQ",
@@ -403,6 +414,8 @@ void do_undefinstr(struct pt_regs *regs)
if (call_undef_hook(regs) == 0)
return;
+ if (!user_mode(regs))
+ trace_traps_undefinstr_tp(regs, 0);
BUG_ON(!user_mode(regs));
force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
}
@@ -766,6 +779,7 @@ asmlinkage void notrace bad_mode(struct pt_regs *regs, int reason, unsigned int
__show_regs(regs);
local_daif_mask();
+ trace_traps_bad_mode_tp(regs, esr);
panic("bad mode");
}
@@ -832,6 +846,7 @@ void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr)
if (regs)
__show_regs(regs);
+ trace_traps_serror_panic_tp(regs, esr);
nmi_panic(regs, "Asynchronous SError Interrupt");
cpu_park_loop();
diff --git a/include/trace/events/traps.h b/include/trace/events/traps.h
new file mode 100644
index 000000000000..b4f53db90ce7
--- /dev/null
+++ b/include/trace/events/traps.h
@@ -0,0 +1,41 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM traps
+
+#if !defined(_TRACE_TRAPS_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_TRAPS_H
+
+#include <linux/tracepoint.h>
+
+/*
+ * Following tracepoints are not exported in tracefs and provide hooking
+ * mechanisms only for testing and debugging purposes.
+ *
+ * Postfixed with _tp to make them easily identifiable in the code.
+ */
+
+/*
+ * Tracepoint for unhandled undefined exception at EL1.
+ */
+DECLARE_TRACE(traps_undefinstr_tp,
+ TP_PROTO(struct pt_regs *regs, unsigned int esr),
+ TP_ARGS(regs, esr));
+
+/*
+ * Tracepoint for bad_mode for the impossible case in the exception vector.
+ */
+DECLARE_TRACE(traps_bad_mode_tp,
+ TP_PROTO(struct pt_regs *regs, unsigned int esr),
+ TP_ARGS(regs, esr));
+
+/*
+ * Tracepoint for unhandled SError exception.
+ */
+DECLARE_TRACE(traps_serror_panic_tp,
+ TP_PROTO(struct pt_regs *regs, unsigned int esr),
+ TP_ARGS(regs, esr));
+
+#endif /* _TRACE_TRAPS_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
--
2.17.1
On Fri, Mar 05, 2021 at 09:36:30PM +0900, Sangmoon Kim wrote:
> When kernel panic occurs, a kernel module can use either the
> panic_notifier or die_notifier to obtain the debugging information.
> However, in case of these exceptions like do_undefinstr(), regs and
> esr data are not passed on. Although a module might be able to find
> those data in the console messages, parsing text messages is very
> expensive behavior for a module especially on mobile devices.
> These bare tracepoints allow a module to probe regs and esr information
> for debugging purpose. _tp suffix comes from bare tracepoints of
> sched/core.c
This use case sounds a lot like what the enterprise and Android people
do via pstore - it seems like it would be better for this to integrate
via the interfaces that other systems are using for similar purposes and
then ensure that whatever information is useful is getting passed
through in a format that makes sense. That'd be more structured and
more readily usable by a wider range of systems than something that's
more of a building block, going via the trace infrastructure seems like
a bit of an indirection.
> @@ -832,6 +846,7 @@ void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr)
> if (regs)
> __show_regs(regs);
>
> + trace_traps_serror_panic_tp(regs, esr);
> nmi_panic(regs, "Asynchronous SError Interrupt");
One of the concerns people have with adding tracepoints is that they can
end up defining ABI so if we *are* going to add any then we need to
think carefully about how they're defined. As things currently stand
they'll pass in the full pt_regs struct which includes not only what's
defined by the hardware but also additional software defined information
we store along with it like the stackframe which would be even more of a
problem if it ends up getting used by someone in a way that ends up as
ABI. These are defined as bare tracehooks which does mitigate against
things ending up getting used in ways that cause problems but people are
still going to worry about things ending up getting relied on one way or
another.
That said it's not clear to me that this will record anything beyond the
pointer directly in the trace buffer so the value might not be useful
for terribly long, that itself feels like it might not be as robust an
interface as it should be.
On Mon, 8 Mar 2021 13:31:49 +0000
Mark Brown <[email protected]> wrote:
> > @@ -832,6 +846,7 @@ void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr)
> > if (regs)
> > __show_regs(regs);
> >
> > + trace_traps_serror_panic_tp(regs, esr);
> > nmi_panic(regs, "Asynchronous SError Interrupt");
>
> One of the concerns people have with adding tracepoints is that they can
> end up defining ABI so if we *are* going to add any then we need to
> think carefully about how they're defined. As things currently stand
> they'll pass in the full pt_regs struct which includes not only what's
> defined by the hardware but also additional software defined information
> we store along with it like the stackframe which would be even more of a
> problem if it ends up getting used by someone in a way that ends up as
> ABI. These are defined as bare tracehooks which does mitigate against
> things ending up getting used in ways that cause problems but people are
> still going to worry about things ending up getting relied on one way or
> another.
>
> That said it's not clear to me that this will record anything beyond the
> pointer directly in the trace buffer so the value might not be useful
> for terribly long, that itself feels like it might not be as robust an
> interface as it should be.
BTW, we are working on an "event probe". That is similar to kprobe
event, but attaches to the output of an event to create another event.
Thus, if you had a trace event that was like this:
trace_regs(pt_regs *regs);
Where you save the regs pointer for output:
TP_STRUCT__entry(
__field(void *, regs )
),
TP_fast_assign(
__entry->regs = regs;
)
Then you would be able to get access to all the regs for that tracepoint!
# echo 'e:pt_regs regs ip=+8(regs):x64' > /sys/kernel/tracing/kprobe_events
Where "e:" denotes that this connects to a trace event, "pt_regs" is the
event name created under kprobes subsystem, "regs" is the trace event to
attach to, "ip=" is what will be printed, and "+8(regs):x64" is a way to
dereference the regs pointer at the time of the trace event is executed.
That is, it will dereference 8 bytes from the pointer and return a x64 hex
number.
Thus, trace events like this may be very useful in the near future.
-- Steve
> -----Original Message-----
> From: Mark Brown <[email protected]>
> Sent: Monday, March 8, 2021 10:32 PM
> Subject: Re: [PATCH] arm64: traps: add tracepoints for unusal exception cases
>
> On Fri, Mar 05, 2021 at 09:36:30PM +0900, Sangmoon Kim wrote:
>
> > When kernel panic occurs, a kernel module can use either the
> > panic_notifier or die_notifier to obtain the debugging information.
>
> > However, in case of these exceptions like do_undefinstr(), regs and
> > esr data are not passed on. Although a module might be able to find
> > those data in the console messages, parsing text messages is very
> > expensive behavior for a module especially on mobile devices.
>
> > These bare tracepoints allow a module to probe regs and esr information
> > for debugging purpose. _tp suffix comes from bare tracepoints of
> > sched/core.c
>
> This use case sounds a lot like what the enterprise and Android people
> do via pstore - it seems like it would be better for this to integrate
> via the interfaces that other systems are using for similar purposes and
> then ensure that whatever information is useful is getting passed
> through in a format that makes sense. That'd be more structured and
> more readily usable by a wider range of systems than something that's
> more of a building block, going via the trace infrastructure seems like
> a bit of an indirection.
>
> > @@ -832,6 +846,7 @@ void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr)
> > if (regs)
> > __show_regs(regs);
> >
> > + trace_traps_serror_panic_tp(regs, esr);
> > nmi_panic(regs, "Asynchronous SError Interrupt");
>
> One of the concerns people have with adding tracepoints is that they can
> end up defining ABI so if we *are* going to add any then we need to
> think carefully about how they're defined. As things currently stand
> they'll pass in the full pt_regs struct which includes not only what's
> defined by the hardware but also additional software defined information
> we store along with it like the stackframe which would be even more of a
> problem if it ends up getting used by someone in a way that ends up as
> ABI. These are defined as bare tracehooks which does mitigate against
> things ending up getting used in ways that cause problems but people are
> still going to worry about things ending up getting relied on one way or
> another.
>
> That said it's not clear to me that this will record anything beyond the
> pointer directly in the trace buffer so the value might not be useful
> for terribly long, that itself feels like it might not be as robust an
> interface as it should be.
Dear Mark,
Thank you for your review. I learned a lot about the concerns when using
tracepoint.
Thanks,
Sangmoon