We have a number of systems industry-wide that have a subset of their
functionality that works as follows:
1. Receive a message from local kmsg, serial console, or netconsole;
2. Apply a set of rules to classify the message;
3. Do something based on this classification (like scheduling a
remediation for the machine), rinse, and repeat.
As a couple of examples of places we have this implemented just inside
Facebook, although this isn't a Facebook-specific problem, we have this
inside our netconsole processing (for alarm classification), and as part
of our machine health checking. We use these messages to determine
fairly important metrics around production health, and it's important
that we get them right.
While for some kinds of issues we have counters, tracepoints, or metrics
with a stable interface which can reliably indicate the issue, in order
to react to production issues quickly we need to work with the interface
which most kernel developers naturally use when developing: printk.
Most production issues come from unexpected phenomena, and as such
usually the code in question doesn't have easily usable tracepoints or
other counters available for the specific problem being mitigated. We
have a number of lines of monitoring defence against problems in
production (host metrics, process metrics, service metrics, etc), and
where it's not feasible to reliably monitor at another level, this kind
of pragmatic netconsole monitoring is essential.
As you'd expect, monitoring using printk is rather brittle for a number
of reasons -- most notably that the message might disappear entirely in
a new version of the kernel, or that the message may change in some way
that the regex or other classification methods start to silently fail.
One factor that makes this even harder is that, under normal operation,
many of these messages are never expected to be hit. For example, there
may be some rare hardware bug which you want to detect if it was to ever
happen again, but its recurrence is not likely or anticipated. This
precludes using something like checking whether the printk in question
was printed somewhere fleetwide recently to determine whether the
message in question is still present or not, since we don't anticipate
that it should be printed anywhere, but still need to monitor for its
future presence in the long-term.
This class of issue has happened on a number of occasions, causing
unhealthy machines with hardware issues to remain in production for
longer than ideal. As a recent example, some monitoring around
blk_update_request fell out of date and caused semi-broken machines to
remain in production for longer than would be desirable.
Searching through the codebase to find the message is also extremely
fragile, because many of the messages are further constructed beyond
their callsite (eg. btrfs_printk and other module-specific wrappers,
each with their own functionality). Even if they aren't, guessing the
format and formulation of the underlying message based on the aesthetics
of the message emitted is not a recipe for success at scale, and our
previous issues with fleetwide machine health checking demonstrate as
much.
This patch provides a solution to the issue of silently changed or
deleted printks: we record pointers to all printk format strings known
at compile time into a new .printk_fmts section, both in vmlinux and
modules. At runtime, this can then be iterated by looking at
/proc/printk_formats, which emits the same format as `printk` itself,
which we already export elsewhere (for example, in netconsole):
# Format: <module>,<facility><level><format>\0
$ perl -p -e 's/\n/\\n/g;s/\0/\n/g' /proc/printk_formats | shuf -n 5
vmlinux,6Disabling APIC timer\n
intel_rapl_common,3intel_rapl_common: Cannot find matching power limit for constraint %d\n
dm_crypt,3device-mapper: crypt: %s: INTEGRITY AEAD ERROR, sector %llu\n
mac80211,6%s: AP bug: HT capability missing from AssocResp\n
vmlinux,3zpool: couldn't create zpool - out of memory\n
This mitigates the majority of cases where we have a highly-specific
printk which we want to match on, as we can now enumerate and check
whether the format changed or the printk callsite disappeared entirely
in userspace. This allows us to catch changes to printks we monitor
earlier and decide what to do about it before it becomes problematic.
There is no additional runtime cost for printk callers or printk itself,
and the assembly generated is exactly the same.
Signed-off-by: Chris Down <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: John Ogness <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Andrew Morton <[email protected]>
---
arch/arm/kernel/entry-v7m.S | 2 +-
arch/arm/lib/backtrace-clang.S | 2 +-
arch/arm/lib/backtrace.S | 2 +-
arch/arm/mach-rpc/io-acorn.S | 2 +-
arch/arm/vfp/vfphw.S | 6 +-
arch/openrisc/kernel/entry.S | 6 +-
arch/powerpc/kernel/head_fsl_booke.S | 2 +-
arch/x86/kernel/head_32.S | 2 +-
include/asm-generic/vmlinux.lds.h | 13 ++
include/linux/module.h | 5 +
include/linux/printk.h | 43 +++++-
init/Kconfig | 15 ++
kernel/module.c | 5 +
kernel/printk/printk.c | 196 ++++++++++++++++++++++++++-
14 files changed, 280 insertions(+), 21 deletions(-)
diff --git a/arch/arm/kernel/entry-v7m.S b/arch/arm/kernel/entry-v7m.S
index d0e898608d30..7bde93c10962 100644
--- a/arch/arm/kernel/entry-v7m.S
+++ b/arch/arm/kernel/entry-v7m.S
@@ -23,7 +23,7 @@ __invalid_entry:
adr r0, strerr
mrs r1, ipsr
mov r2, lr
- bl printk
+ bl _printk
#endif
mov r0, sp
bl show_regs
diff --git a/arch/arm/lib/backtrace-clang.S b/arch/arm/lib/backtrace-clang.S
index 6174c45f53a5..5b2cdb1003e3 100644
--- a/arch/arm/lib/backtrace-clang.S
+++ b/arch/arm/lib/backtrace-clang.S
@@ -202,7 +202,7 @@ finished_setup:
1006: adr r0, .Lbad
mov r1, loglvl
mov r2, frame
- bl printk
+ bl _printk
no_frame: ldmfd sp!, {r4 - r9, fp, pc}
ENDPROC(c_backtrace)
.pushsection __ex_table,"a"
diff --git a/arch/arm/lib/backtrace.S b/arch/arm/lib/backtrace.S
index 872f658638d9..e8408f22d4dc 100644
--- a/arch/arm/lib/backtrace.S
+++ b/arch/arm/lib/backtrace.S
@@ -103,7 +103,7 @@ for_each_frame: tst frame, mask @ Check for address exceptions
1006: adr r0, .Lbad
mov r1, loglvl
mov r2, frame
- bl printk
+ bl _printk
no_frame: ldmfd sp!, {r4 - r9, pc}
ENDPROC(c_backtrace)
diff --git a/arch/arm/mach-rpc/io-acorn.S b/arch/arm/mach-rpc/io-acorn.S
index b9082a2a2a01..aa9bf0d771c0 100644
--- a/arch/arm/mach-rpc/io-acorn.S
+++ b/arch/arm/mach-rpc/io-acorn.S
@@ -25,4 +25,4 @@ ENTRY(insl)
ENTRY(outsl)
adr r0, .Liosl_warning
mov r1, lr
- b printk
+ b _printk
diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index d5837bf05a9a..6f7926c9c179 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -23,7 +23,7 @@
#ifdef DEBUG
stmfd sp!, {r0-r3, ip, lr}
ldr r0, =1f
- bl printk
+ bl _printk
ldmfd sp!, {r0-r3, ip, lr}
.pushsection .rodata, "a"
@@ -38,7 +38,7 @@
stmfd sp!, {r0-r3, ip, lr}
mov r1, \arg
ldr r0, =1f
- bl printk
+ bl _printk
ldmfd sp!, {r0-r3, ip, lr}
.pushsection .rodata, "a"
@@ -55,7 +55,7 @@
mov r2, \arg2
mov r1, \arg1
ldr r0, =1f
- bl printk
+ bl _printk
ldmfd sp!, {r0-r3, ip, lr}
.pushsection .rodata, "a"
diff --git a/arch/openrisc/kernel/entry.S b/arch/openrisc/kernel/entry.S
index bc657e55c15f..947613f61d4a 100644
--- a/arch/openrisc/kernel/entry.S
+++ b/arch/openrisc/kernel/entry.S
@@ -551,7 +551,7 @@ EXCEPTION_ENTRY(_external_irq_handler)
l.movhi r3,hi(42f)
l.ori r3,r3,lo(42f)
l.sw 0x0(r1),r3
- l.jal printk
+ l.jal _printk
l.sw 0x4(r1),r4
l.addi r1,r1,0x8
@@ -681,8 +681,8 @@ _syscall_debug:
l.sw -4(r1),r27
l.sw -8(r1),r11
l.addi r1,r1,-8
- l.movhi r27,hi(printk)
- l.ori r27,r27,lo(printk)
+ l.movhi r27,hi(_printk)
+ l.ori r27,r27,lo(_printk)
l.jalr r27
l.nop
l.addi r1,r1,8
diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S
index fdd4d274c245..f2c7ecef9b14 100644
--- a/arch/powerpc/kernel/head_fsl_booke.S
+++ b/arch/powerpc/kernel/head_fsl_booke.S
@@ -852,7 +852,7 @@ KernelSPE:
ori r3,r3,87f@l
mr r4,r2 /* current */
lwz r5,_NIP(r1)
- bl printk
+ bl _printk
#endif
b ret_from_except
#ifdef CONFIG_PRINTK
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 7ed84c282233..c207b789dd22 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -446,7 +446,7 @@ SYM_FUNC_START(early_ignore_irq)
pushl 32(%esp)
pushl 40(%esp)
pushl $int_msg
- call printk
+ call _printk
call dump_stack
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 34b7e0d2346c..0ca6e28e05d6 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -309,6 +309,17 @@
#define ACPI_PROBE_TABLE(name)
#endif
+#ifdef CONFIG_PRINTK_ENUMERATION
+#define PRINTK_FMTS \
+ .printk_fmts : AT(ADDR(.printk_fmts) - LOAD_OFFSET) { \
+ __start_printk_fmts = .; \
+ *(.printk_fmts) \
+ __stop_printk_fmts = .; \
+ }
+#else
+#define PRINTK_FMTS
+#endif
+
#ifdef CONFIG_THERMAL
#define THERMAL_TABLE(name) \
. = ALIGN(8); \
@@ -480,6 +491,8 @@
\
TRACEDATA \
\
+ PRINTK_FMTS \
+ \
/* Kernel symbol table: Normal symbols */ \
__ksymtab : AT(ADDR(__ksymtab) - LOAD_OFFSET) { \
__start___ksymtab = .; \
diff --git a/include/linux/module.h b/include/linux/module.h
index 7a0bcb5b1ffc..4235b14a22ef 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -516,6 +516,11 @@ struct module {
struct klp_modinfo *klp_info;
#endif
+#ifdef CONFIG_PRINTK_ENUMERATION
+ unsigned int printk_fmts_sec_size;
+ const char **printk_fmts_start;
+#endif
+
#ifdef CONFIG_MODULE_UNLOAD
/* What modules depend on me? */
struct list_head source_list;
diff --git a/include/linux/printk.h b/include/linux/printk.h
index d3c08095a9a3..745a9915ec72 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -173,12 +173,12 @@ asmlinkage __printf(1, 0)
int vprintk(const char *fmt, va_list args);
asmlinkage __printf(1, 2) __cold
-int printk(const char *fmt, ...);
+int _printk(const char *fmt, ...);
/*
* Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ !
*/
-__printf(1, 2) __cold int printk_deferred(const char *fmt, ...);
+__printf(1, 2) __cold int _printk_deferred(const char *fmt, ...);
/*
* Please don't use printk_ratelimit(), because it shares ratelimiting state
@@ -216,12 +216,12 @@ int vprintk(const char *s, va_list args)
return 0;
}
static inline __printf(1, 2) __cold
-int printk(const char *s, ...)
+int _printk(const char *s, ...)
{
return 0;
}
static inline __printf(1, 2) __cold
-int printk_deferred(const char *s, ...)
+int _printk_deferred(const char *s, ...)
{
return 0;
}
@@ -284,6 +284,11 @@ static inline void printk_safe_flush_on_panic(void)
extern int kptr_restrict;
+#ifdef CONFIG_PRINTK_ENUMERATION
+extern const char *__start_printk_fmts[];
+extern const char *__stop_printk_fmts[];
+#endif
+
/**
* pr_fmt - used by the pr_*() macros to generate the printk format string
* @fmt: format string passed from a pr_*() macro
@@ -301,6 +306,36 @@ extern int kptr_restrict;
#define pr_fmt(fmt) fmt
#endif
+#ifdef CONFIG_PRINTK_ENUMERATION
+#define printk_store_fmt(func, fmt, ...) \
+ ({ \
+ int _printk_ret; \
+ \
+ if (__builtin_constant_p(fmt)) { \
+ /*
+ * The compiler may not be able to eliminate this
+ * branch, so we need to make sure that it doesn't see
+ * any hypothetical assignment for non-constants even
+ * though this is already inside the
+ * __builtin_constant_p guard.
+ */ \
+ static const char *_fmt __section(".printk_fmts") = \
+ __builtin_constant_p(fmt) ? fmt : NULL; \
+ _printk_ret = func(_fmt, ##__VA_ARGS__); \
+ } else \
+ _printk_ret = func(fmt, ##__VA_ARGS__); \
+ \
+ _printk_ret; \
+ })
+
+#define printk(fmt, ...) printk_store_fmt(_printk, fmt, ##__VA_ARGS__)
+#define printk_deferred(fmt, ...) \
+ printk_store_fmt(_printk_deferred, fmt, ##__VA_ARGS__)
+#else /* !CONFIG_PRINTK_ENUMERATION */
+#define printk(...) _printk(__VA_ARGS__)
+#define printk_deferred(...) _printk_deferred(__VA_ARGS__)
+#endif /* CONFIG_PRINTK_ENUMERATION */
+
/**
* pr_emerg - Print an emergency-level message
* @fmt: format string
diff --git a/init/Kconfig b/init/Kconfig
index e4e2932da237..d0839e901267 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -764,6 +764,21 @@ config PRINTK_SAFE_LOG_BUF_SHIFT
13 => 8 KB for each CPU
12 => 4 KB for each CPU
+config PRINTK_ENUMERATION
+ bool "/proc/printk_formats support"
+ depends on PRINTK
+ help
+ Add support for enumeration of all printk formats known at compile
+ time at /proc/printk_formats. This includes printk formats built into
+ the kernel, and those in loaded modules.
+
+ This can be used as part of maintaining daemons which monitor
+ /dev/kmsg, as it permits auditing the printk formats present in a
+ kernel, allowing monitoring of cases where monitored printks are
+ changed or no longer present.
+
+ There is no additional runtime cost to printk with this enabled.
+
#
# Architectures with an unreliable sched_clock() should select this:
#
diff --git a/kernel/module.c b/kernel/module.c
index 6ad424f07a4a..143f00545202 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3430,6 +3430,11 @@ static int find_module_sections(struct module *mod, struct load_info *info)
sizeof(unsigned long),
&mod->num_kprobe_blacklist);
#endif
+#ifdef CONFIG_PRINTK_ENUMERATION
+ mod->printk_fmts_start = section_objs(info, ".printk_fmts",
+ sizeof(*mod->printk_fmts_start),
+ &mod->printk_fmts_sec_size);
+#endif
#ifdef CONFIG_HAVE_STATIC_CALL_INLINE
mod->static_call_sites = section_objs(info, ".static_call_sites",
sizeof(*mod->static_call_sites),
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 6639a0cfe0ac..57142f126a1b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -47,6 +47,7 @@
#include <linux/sched/clock.h>
#include <linux/sched/debug.h>
#include <linux/sched/task_stack.h>
+#include <linux/proc_fs.h>
#include <linux/uaccess.h>
#include <asm/sections.h>
@@ -617,6 +618,188 @@ static ssize_t msg_print_ext_body(char *buf, size_t size,
return len;
}
+#ifdef CONFIG_PRINTK_ENUMERATION
+
+/* /proc/printk_formats - userspace enumeration of printk formats */
+
+struct printk_fmt_sec {
+ struct list_head list;
+ const struct module *module;
+ const char **start;
+ const char **end;
+};
+
+/*
+ * Stores .printk_fmt section boundaries for vmlinux and all loaded modules.
+ * Add entries with store_printk_fmt_sec, remove entries with
+ * remove_printk_fmt_sec.
+ */
+static LIST_HEAD(printk_fmts_list);
+
+/*
+ * To prevent mutation, hold printk_fmts_mutex. If mutation is acceptable, this
+ * can be read at any time without the mutex as long as it is read with
+ * protection against data tearing.
+ */
+static size_t printk_fmts_total_size;
+
+/* For printk_fmts_list and printk_fmts_total_size. */
+static DEFINE_MUTEX(printk_fmts_mutex);
+
+static void store_printk_fmt_sec(const struct module *mod, const char **start,
+ const char **end)
+{
+ struct printk_fmt_sec *ps = NULL;
+ const char **fptr = NULL;
+ size_t size = 0;
+
+ ps = kmalloc(sizeof(struct printk_fmt_sec), GFP_KERNEL);
+ if (!ps)
+ return;
+
+ ps->module = mod;
+ ps->start = start;
+ ps->end = end;
+
+ for (fptr = ps->start; fptr < ps->end; fptr++)
+ size += strlen(*fptr) + 1;
+
+ mutex_lock(&printk_fmts_mutex);
+ WRITE_ONCE(printk_fmts_total_size, printk_fmts_total_size + size);
+ list_add_tail(&ps->list, &printk_fmts_list);
+ mutex_unlock(&printk_fmts_mutex);
+}
+
+#ifdef CONFIG_MODULES
+
+static void remove_printk_fmt_sec(const struct module *mod)
+{
+ struct printk_fmt_sec *tmp = NULL, *ps = NULL;
+ const char **fptr = NULL;
+
+ if (WARN_ON_ONCE(!mod))
+ return;
+
+ mutex_lock(&printk_fmts_mutex);
+
+ list_for_each_entry_safe(ps, tmp, &printk_fmts_list, list) {
+ if (ps->module == mod) {
+ size_t new_size = printk_fmts_total_size;
+
+ for (fptr = ps->start; fptr < ps->end; fptr++)
+ new_size -= strlen(*fptr) + 1;
+
+ WRITE_ONCE(printk_fmts_total_size, new_size);
+ list_del(&ps->list);
+ kfree(ps);
+ break;
+ }
+ }
+
+ mutex_unlock(&printk_fmts_mutex);
+}
+
+static int module_printk_fmts_notify(struct notifier_block *self,
+ unsigned long val, void *data)
+{
+ const struct module *mod = data;
+
+ if (mod->printk_fmts_sec_size) {
+ switch (val) {
+ case MODULE_STATE_COMING:
+ store_printk_fmt_sec(mod, mod->printk_fmts_start,
+ mod->printk_fmts_start +
+ mod->printk_fmts_sec_size);
+ break;
+
+ case MODULE_STATE_GOING:
+ remove_printk_fmt_sec(mod);
+ break;
+ }
+ }
+
+ return NOTIFY_OK;
+}
+
+static const char *ps_get_module_name(const struct printk_fmt_sec *ps)
+{
+ return ps->module ? ps->module->name : "vmlinux";
+}
+
+static struct notifier_block module_printk_fmts_nb = {
+ .notifier_call = module_printk_fmts_notify,
+};
+
+static int __init module_printk_fmts_init(void)
+{
+ return register_module_notifier(&module_printk_fmts_nb);
+}
+
+core_initcall(module_printk_fmts_init);
+
+#else /* !CONFIG_MODULES */
+static const char *ps_get_module_name(const struct printk_fmt_sec *ps)
+{
+ return "vmlinux";
+}
+#endif /* CONFIG_MODULES */
+
+static int proc_pf_show(struct seq_file *s, void *v)
+{
+ const struct printk_fmt_sec *ps = NULL;
+ const char **fptr = NULL;
+
+ mutex_lock(&printk_fmts_mutex);
+
+ list_for_each_entry(ps, &printk_fmts_list, list) {
+ const char *mod_name = ps_get_module_name(ps);
+
+ for (fptr = ps->start; fptr < ps->end; fptr++) {
+ seq_puts(s, mod_name);
+ seq_putc(s, ',');
+ seq_puts(s, *fptr);
+ seq_putc(s, '\0');
+ }
+ }
+
+ mutex_unlock(&printk_fmts_mutex);
+
+ return 0;
+}
+
+static int proc_pf_open(struct inode *inode, struct file *file)
+{
+ /*
+ * We don't need to hold the mutex here to ensure that
+ * printk_fmts_total_size doesn't change prior to iteration -- worst
+ * case, seq_read_iter() will reallocate.
+ */
+ return single_open_size(file, proc_pf_show, NULL,
+ READ_ONCE(printk_fmts_total_size));
+}
+
+static const struct proc_ops printk_proc_ops = {
+ .proc_flags = PROC_ENTRY_PERMANENT,
+ .proc_open = proc_pf_open,
+ .proc_read_iter = seq_read_iter,
+ .proc_lseek = seq_lseek,
+ .proc_release = single_release,
+};
+
+static int __init init_printk_fmts(void)
+{
+ const struct proc_dir_entry *pd = NULL;
+
+ store_printk_fmt_sec(NULL, __start_printk_fmts, __stop_printk_fmts);
+ pd = proc_create("printk_formats", 0, NULL, &printk_proc_ops);
+
+ return pd ? 0 : -ENOMEM;
+}
+
+core_initcall(init_printk_fmts);
+
+#endif /* CONFIG_PRINTK_ENUMERATION */
+
/* /dev/kmsg - userspace message inject/listen interface */
struct devkmsg_user {
u64 seq;
@@ -2111,10 +2294,13 @@ int vprintk_default(const char *fmt, va_list args)
EXPORT_SYMBOL_GPL(vprintk_default);
/**
- * printk - print a kernel message
+ * _printk - print a kernel message
* @fmt: format string
*
- * This is printk(). It can be called from any context. We want it to work.
+ * This is _printk(). It can be called from any context. We want it to work.
+ *
+ * If printk enumeration is enabled, _printk() is called from printk_store_fmt.
+ * Otherwise, printk is simply #defined to _printk.
*
* We try to grab the console_lock. If we succeed, it's easy - we log the
* output and call the console drivers. If we fail to get the semaphore, we
@@ -2131,7 +2317,7 @@ EXPORT_SYMBOL_GPL(vprintk_default);
*
* See the vsnprintf() documentation for format string extensions over C99.
*/
-asmlinkage __visible int printk(const char *fmt, ...)
+asmlinkage __visible int _printk(const char *fmt, ...)
{
va_list args;
int r;
@@ -2142,7 +2328,7 @@ asmlinkage __visible int printk(const char *fmt, ...)
return r;
}
-EXPORT_SYMBOL(printk);
+EXPORT_SYMBOL(_printk);
#else /* CONFIG_PRINTK */
@@ -3133,7 +3319,7 @@ int vprintk_deferred(const char *fmt, va_list args)
return r;
}
-int printk_deferred(const char *fmt, ...)
+int _printk_deferred(const char *fmt, ...)
{
va_list args;
int r;
--
2.30.0
Some improvements I noticed that I'll include when sending v2.
Chris Down writes:
>+static void remove_printk_fmt_sec(const struct module *mod)
>+{
>+ struct printk_fmt_sec *tmp = NULL, *ps = NULL;
>+ const char **fptr = NULL;
fptr can be scoped to list_for_each_entry_safe, I'll move it.
>+static int proc_pf_show(struct seq_file *s, void *v)
>+{
>+ const struct printk_fmt_sec *ps = NULL;
>+ const char **fptr = NULL;
Ditto for list_for_each_entry.
>+
>+ mutex_lock(&printk_fmts_mutex);
>+
>+ list_for_each_entry(ps, &printk_fmts_list, list) {
>+ const char *mod_name = ps_get_module_name(ps);
>+
>+ for (fptr = ps->start; fptr < ps->end; fptr++) {
>+ seq_puts(s, mod_name);
>+ seq_putc(s, ',');
>+ seq_puts(s, *fptr);
>+ seq_putc(s, '\0');
>+ }
>+ }
Not sure why I didn't just use seq_printf here. I'll do that.
I'll wait a few days for feedback then will send v2 either way. :-)
Hi,
I would like to hear opinion from a bigger audience. It is an
userspace interface that we might need to maintain forewer.
Adding few more people in to CC:
Steven Rostedt <[email protected]>: printk co-maintainer
Alexey Dobriyan <[email protected]>: fs/proc maintainer
Greg Kroah-Hartman <[email protected]>: sysfs maintainer
Jason Baron <[email protected]>: dynamic_debug maintainer
Kees Cook <[email protected]>: security POV
[email protected]: Linux API mailing list
Of course, we should also ask if this is the right approach
for the think that you want to achieve.
The motivation for this patch is that the strings printed by kernels
are not reliable and you want a simple way to compare differences
bethween versions. Do I get it right?
See more comments below.
On Thu 2021-02-04 15:37:52, Chris Down wrote:
> We have a number of systems industry-wide that have a subset of their
> functionality that works as follows:
> 1. Receive a message from local kmsg, serial console, or netconsole;
> 2. Apply a set of rules to classify the message;
> 3. Do something based on this classification (like scheduling a
> remediation for the machine), rinse, and repeat.
>
> As a couple of examples of places we have this implemented just inside
> Facebook, although this isn't a Facebook-specific problem, we have this
> inside our netconsole processing (for alarm classification), and as part
> of our machine health checking. We use these messages to determine
> fairly important metrics around production health, and it's important
> that we get them right.
>
> While for some kinds of issues we have counters, tracepoints, or metrics
> with a stable interface which can reliably indicate the issue, in order
> to react to production issues quickly we need to work with the interface
> which most kernel developers naturally use when developing: printk.
>
> Most production issues come from unexpected phenomena, and as such
> usually the code in question doesn't have easily usable tracepoints or
> other counters available for the specific problem being mitigated. We
> have a number of lines of monitoring defence against problems in
> production (host metrics, process metrics, service metrics, etc), and
> where it's not feasible to reliably monitor at another level, this kind
> of pragmatic netconsole monitoring is essential.
>
> As you'd expect, monitoring using printk is rather brittle for a number
> of reasons -- most notably that the message might disappear entirely in
> a new version of the kernel, or that the message may change in some way
> that the regex or other classification methods start to silently fail.
Another is that printk() is not reliable on its own. Messages might
get lost. The size of the log buffer is limited. Deamon reading
/dev/kmsg need not be scheduled in time or often enough. Console
might be slow. The messages are filtered on the console by console_loglevel.
> One factor that makes this even harder is that, under normal operation,
> many of these messages are never expected to be hit. For example, there
> may be some rare hardware bug which you want to detect if it was to ever
> happen again, but its recurrence is not likely or anticipated. This
> precludes using something like checking whether the printk in question
> was printed somewhere fleetwide recently to determine whether the
> message in question is still present or not, since we don't anticipate
> that it should be printed anywhere, but still need to monitor for its
> future presence in the long-term.
>
> This class of issue has happened on a number of occasions, causing
> unhealthy machines with hardware issues to remain in production for
> longer than ideal. As a recent example, some monitoring around
> blk_update_request fell out of date and caused semi-broken machines to
> remain in production for longer than would be desirable.
>
> Searching through the codebase to find the message is also extremely
> fragile, because many of the messages are further constructed beyond
> their callsite (eg. btrfs_printk and other module-specific wrappers,
> each with their own functionality). Even if they aren't, guessing the
> format and formulation of the underlying message based on the aesthetics
> of the message emitted is not a recipe for success at scale, and our
> previous issues with fleetwide machine health checking demonstrate as
> much.
>
> This patch provides a solution to the issue of silently changed or
> deleted printks: we record pointers to all printk format strings known
> at compile time into a new .printk_fmts section, both in vmlinux and
> modules. At runtime, this can then be iterated by looking at
> /proc/printk_formats, which emits the same format as `printk` itself,
> which we already export elsewhere (for example, in netconsole):
>
> # Format: <module>,<facility><level><format>\0
> $ perl -p -e 's/\n/\\n/g;s/\0/\n/g' /proc/printk_formats | shuf -n 5
> vmlinux,6Disabling APIC timer\n
> intel_rapl_common,3intel_rapl_common: Cannot find matching power limit for constraint %d\n
> dm_crypt,3device-mapper: crypt: %s: INTEGRITY AEAD ERROR, sector %llu\n
> mac80211,6%s: AP bug: HT capability missing from AssocResp\n
> vmlinux,3zpool: couldn't create zpool - out of memory\n
The facility and log level are not well separated from the format string.
Also this is yet another style how the format is displayed. We already have
+ console/syslog: formated by record_print_text()
+ /dev/kmsg: formatted by info_print_ext_header(), msg_print_ext_body().
+ /sys/kernel/debug/dynamic_debug/control
+ /sys/kernel/debug/tracing/printk_formats
We should get some inspiration from the existing interfaces.
But we first should decide what information might be useful:
+ 'facility' should not be needed. All messages should be from
kernel.
+ <module> is already optinaly added by pr_fmt() to the printed strings
as: pr_fmt(): ...
+ dynamic_debug seems to print KBUILD_MODNAME even when the module
is built in.
+ dynamic debug also prints <source_file:line>
> This mitigates the majority of cases where we have a highly-specific
> printk which we want to match on, as we can now enumerate and check
> whether the format changed or the printk callsite disappeared entirely
> in userspace. This allows us to catch changes to printks we monitor
> earlier and decide what to do about it before it becomes problematic.
>
> There is no additional runtime cost for printk callers or printk itself,
> and the assembly generated is exactly the same.
>
> Signed-off-by: Chris Down <[email protected]>
> Cc: Petr Mladek <[email protected]>
> Cc: Sergey Senozhatsky <[email protected]>
> Cc: John Ogness <[email protected]>
> Cc: Johannes Weiner <[email protected]>
> Cc: Andrew Morton <[email protected]>
> ---
> arch/arm/kernel/entry-v7m.S | 2 +-
> arch/arm/lib/backtrace-clang.S | 2 +-
> arch/arm/lib/backtrace.S | 2 +-
> arch/arm/mach-rpc/io-acorn.S | 2 +-
> arch/arm/vfp/vfphw.S | 6 +-
> arch/openrisc/kernel/entry.S | 6 +-
> arch/powerpc/kernel/head_fsl_booke.S | 2 +-
> arch/x86/kernel/head_32.S | 2 +-
> include/asm-generic/vmlinux.lds.h | 13 ++
> include/linux/module.h | 5 +
> include/linux/printk.h | 43 +++++-
> init/Kconfig | 15 ++
> kernel/module.c | 5 +
> kernel/printk/printk.c | 196 ++++++++++++++++++++++++++-
> 14 files changed, 280 insertions(+), 21 deletions(-)
>
> diff --git a/arch/arm/kernel/entry-v7m.S b/arch/arm/kernel/entry-v7m.S
> index d0e898608d30..7bde93c10962 100644
> --- a/arch/arm/kernel/entry-v7m.S
> +++ b/arch/arm/kernel/entry-v7m.S
> @@ -23,7 +23,7 @@ __invalid_entry:
> adr r0, strerr
> mrs r1, ipsr
> mov r2, lr
> - bl printk
> + bl _printk
> #endif
> mov r0, sp
> bl show_regs
> diff --git a/arch/arm/lib/backtrace-clang.S b/arch/arm/lib/backtrace-clang.S
> index 6174c45f53a5..5b2cdb1003e3 100644
> --- a/arch/arm/lib/backtrace-clang.S
> +++ b/arch/arm/lib/backtrace-clang.S
> @@ -202,7 +202,7 @@ finished_setup:
> 1006: adr r0, .Lbad
> mov r1, loglvl
> mov r2, frame
> - bl printk
> + bl _printk
> no_frame: ldmfd sp!, {r4 - r9, fp, pc}
> ENDPROC(c_backtrace)
> .pushsection __ex_table,"a"
> diff --git a/arch/arm/lib/backtrace.S b/arch/arm/lib/backtrace.S
> index 872f658638d9..e8408f22d4dc 100644
> --- a/arch/arm/lib/backtrace.S
> +++ b/arch/arm/lib/backtrace.S
> @@ -103,7 +103,7 @@ for_each_frame: tst frame, mask @ Check for address exceptions
> 1006: adr r0, .Lbad
> mov r1, loglvl
> mov r2, frame
> - bl printk
> + bl _printk
> no_frame: ldmfd sp!, {r4 - r9, pc}
> ENDPROC(c_backtrace)
>
> diff --git a/arch/arm/mach-rpc/io-acorn.S b/arch/arm/mach-rpc/io-acorn.S
> index b9082a2a2a01..aa9bf0d771c0 100644
> --- a/arch/arm/mach-rpc/io-acorn.S
> +++ b/arch/arm/mach-rpc/io-acorn.S
> @@ -25,4 +25,4 @@ ENTRY(insl)
> ENTRY(outsl)
> adr r0, .Liosl_warning
> mov r1, lr
> - b printk
> + b _printk
> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> index d5837bf05a9a..6f7926c9c179 100644
> --- a/arch/arm/vfp/vfphw.S
> +++ b/arch/arm/vfp/vfphw.S
> @@ -23,7 +23,7 @@
> #ifdef DEBUG
> stmfd sp!, {r0-r3, ip, lr}
> ldr r0, =1f
> - bl printk
> + bl _printk
> ldmfd sp!, {r0-r3, ip, lr}
>
> .pushsection .rodata, "a"
> @@ -38,7 +38,7 @@
> stmfd sp!, {r0-r3, ip, lr}
> mov r1, \arg
> ldr r0, =1f
> - bl printk
> + bl _printk
> ldmfd sp!, {r0-r3, ip, lr}
>
> .pushsection .rodata, "a"
> @@ -55,7 +55,7 @@
> mov r2, \arg2
> mov r1, \arg1
> ldr r0, =1f
> - bl printk
> + bl _printk
> ldmfd sp!, {r0-r3, ip, lr}
>
> .pushsection .rodata, "a"
> diff --git a/arch/openrisc/kernel/entry.S b/arch/openrisc/kernel/entry.S
> index bc657e55c15f..947613f61d4a 100644
> --- a/arch/openrisc/kernel/entry.S
> +++ b/arch/openrisc/kernel/entry.S
> @@ -551,7 +551,7 @@ EXCEPTION_ENTRY(_external_irq_handler)
> l.movhi r3,hi(42f)
> l.ori r3,r3,lo(42f)
> l.sw 0x0(r1),r3
> - l.jal printk
> + l.jal _printk
> l.sw 0x4(r1),r4
> l.addi r1,r1,0x8
>
> @@ -681,8 +681,8 @@ _syscall_debug:
> l.sw -4(r1),r27
> l.sw -8(r1),r11
> l.addi r1,r1,-8
> - l.movhi r27,hi(printk)
> - l.ori r27,r27,lo(printk)
> + l.movhi r27,hi(_printk)
> + l.ori r27,r27,lo(_printk)
> l.jalr r27
> l.nop
> l.addi r1,r1,8
> diff --git a/arch/powerpc/kernel/head_fsl_booke.S b/arch/powerpc/kernel/head_fsl_booke.S
> index fdd4d274c245..f2c7ecef9b14 100644
> --- a/arch/powerpc/kernel/head_fsl_booke.S
> +++ b/arch/powerpc/kernel/head_fsl_booke.S
> @@ -852,7 +852,7 @@ KernelSPE:
> ori r3,r3,87f@l
> mr r4,r2 /* current */
> lwz r5,_NIP(r1)
> - bl printk
> + bl _printk
> #endif
> b ret_from_except
> #ifdef CONFIG_PRINTK
> diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
> index 7ed84c282233..c207b789dd22 100644
> --- a/arch/x86/kernel/head_32.S
> +++ b/arch/x86/kernel/head_32.S
> @@ -446,7 +446,7 @@ SYM_FUNC_START(early_ignore_irq)
> pushl 32(%esp)
> pushl 40(%esp)
> pushl $int_msg
> - call printk
> + call _printk
>
> call dump_stack
>
> diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> index 34b7e0d2346c..0ca6e28e05d6 100644
> --- a/include/asm-generic/vmlinux.lds.h
> +++ b/include/asm-generic/vmlinux.lds.h
> @@ -309,6 +309,17 @@
> #define ACPI_PROBE_TABLE(name)
> #endif
>
> +#ifdef CONFIG_PRINTK_ENUMERATION
> +#define PRINTK_FMTS \
> + .printk_fmts : AT(ADDR(.printk_fmts) - LOAD_OFFSET) { \
> + __start_printk_fmts = .; \
> + *(.printk_fmts) \
> + __stop_printk_fmts = .; \
> + }
> +#else
> +#define PRINTK_FMTS
> +#endif
It should be defined after #define TRACEDATA to follow the existing
style.
But honestly I am not much familiar with the sections definitions.
I am curious why TRACE_PRINTKS() and __dyndbg are defined
a bit different way.
> +
> #ifdef CONFIG_THERMAL
> #define THERMAL_TABLE(name) \
> . = ALIGN(8); \
> @@ -480,6 +491,8 @@
> \
> TRACEDATA \
> \
> + PRINTK_FMTS \
> + \
> /* Kernel symbol table: Normal symbols */ \
> __ksymtab : AT(ADDR(__ksymtab) - LOAD_OFFSET) { \
> __start___ksymtab = .; \
> diff --git a/include/linux/module.h b/include/linux/module.h
> index 7a0bcb5b1ffc..4235b14a22ef 100644
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -516,6 +516,11 @@ struct module {
> struct klp_modinfo *klp_info;
> #endif
>
> +#ifdef CONFIG_PRINTK_ENUMERATION
> + unsigned int printk_fmts_sec_size;
> + const char **printk_fmts_start;
> +#endif
> +
> #ifdef CONFIG_MODULE_UNLOAD
> /* What modules depend on me? */
> struct list_head source_list;
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index d3c08095a9a3..745a9915ec72 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -173,12 +173,12 @@ asmlinkage __printf(1, 0)
> int vprintk(const char *fmt, va_list args);
>
> asmlinkage __printf(1, 2) __cold
> -int printk(const char *fmt, ...);
> +int _printk(const char *fmt, ...);
>
> /*
> * Special printk facility for scheduler/timekeeping use only, _DO_NOT_USE_ !
> */
> -__printf(1, 2) __cold int printk_deferred(const char *fmt, ...);
> +__printf(1, 2) __cold int _printk_deferred(const char *fmt, ...);
>
> /*
> * Please don't use printk_ratelimit(), because it shares ratelimiting state
> @@ -216,12 +216,12 @@ int vprintk(const char *s, va_list args)
> return 0;
> }
> static inline __printf(1, 2) __cold
> -int printk(const char *s, ...)
> +int _printk(const char *s, ...)
> {
> return 0;
> }
> static inline __printf(1, 2) __cold
> -int printk_deferred(const char *s, ...)
> +int _printk_deferred(const char *s, ...)
> {
> return 0;
> }
> @@ -284,6 +284,11 @@ static inline void printk_safe_flush_on_panic(void)
>
> extern int kptr_restrict;
>
> +#ifdef CONFIG_PRINTK_ENUMERATION
> +extern const char *__start_printk_fmts[];
> +extern const char *__stop_printk_fmts[];
> +#endif
> +
> /**
> * pr_fmt - used by the pr_*() macros to generate the printk format string
> * @fmt: format string passed from a pr_*() macro
> @@ -301,6 +306,36 @@ extern int kptr_restrict;
> #define pr_fmt(fmt) fmt
> #endif
>
> +#ifdef CONFIG_PRINTK_ENUMERATION
> +#define printk_store_fmt(func, fmt, ...) \
> + ({ \
> + int _printk_ret; \
> + \
> + if (__builtin_constant_p(fmt)) { \
> + /*
> + * The compiler may not be able to eliminate this
> + * branch, so we need to make sure that it doesn't see
> + * any hypothetical assignment for non-constants even
> + * though this is already inside the
> + * __builtin_constant_p guard.
> + */ \
> + static const char *_fmt __section(".printk_fmts") = \
> + __builtin_constant_p(fmt) ? fmt : NULL; \
> + _printk_ret = func(_fmt, ##__VA_ARGS__); \
> + } else \
> + _printk_ret = func(fmt, ##__VA_ARGS__); \
> + \
> + _printk_ret; \
> + })
> +
> +#define printk(fmt, ...) printk_store_fmt(_printk, fmt, ##__VA_ARGS__)
> +#define printk_deferred(fmt, ...) \
> + printk_store_fmt(_printk_deferred, fmt, ##__VA_ARGS__)
> +#else /* !CONFIG_PRINTK_ENUMERATION */
> +#define printk(...) _printk(__VA_ARGS__)
> +#define printk_deferred(...) _printk_deferred(__VA_ARGS__)
> +#endif /* CONFIG_PRINTK_ENUMERATION */
> +
> /**
> * pr_emerg - Print an emergency-level message
> * @fmt: format string
> diff --git a/init/Kconfig b/init/Kconfig
> index e4e2932da237..d0839e901267 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -764,6 +764,21 @@ config PRINTK_SAFE_LOG_BUF_SHIFT
> 13 => 8 KB for each CPU
> 12 => 4 KB for each CPU
>
> +config PRINTK_ENUMERATION
> + bool "/proc/printk_formats support"
> + depends on PRINTK
> + help
> + Add support for enumeration of all printk formats known at compile
> + time at /proc/printk_formats. This includes printk formats built into
> + the kernel, and those in loaded modules.
> +
> + This can be used as part of maintaining daemons which monitor
> + /dev/kmsg, as it permits auditing the printk formats present in a
> + kernel, allowing monitoring of cases where monitored printks are
> + changed or no longer present.
> +
> + There is no additional runtime cost to printk with this enabled.
> +
> #
> # Architectures with an unreliable sched_clock() should select this:
> #
> diff --git a/kernel/module.c b/kernel/module.c
> index 6ad424f07a4a..143f00545202 100644
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3430,6 +3430,11 @@ static int find_module_sections(struct module *mod, struct load_info *info)
> sizeof(unsigned long),
> &mod->num_kprobe_blacklist);
> #endif
> +#ifdef CONFIG_PRINTK_ENUMERATION
> + mod->printk_fmts_start = section_objs(info, ".printk_fmts",
> + sizeof(*mod->printk_fmts_start),
> + &mod->printk_fmts_sec_size);
> +#endif
> #ifdef CONFIG_HAVE_STATIC_CALL_INLINE
> mod->static_call_sites = section_objs(info, ".static_call_sites",
> sizeof(*mod->static_call_sites),
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 6639a0cfe0ac..57142f126a1b 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -47,6 +47,7 @@
> #include <linux/sched/clock.h>
> #include <linux/sched/debug.h>
> #include <linux/sched/task_stack.h>
> +#include <linux/proc_fs.h>
>
> #include <linux/uaccess.h>
> #include <asm/sections.h>
> @@ -617,6 +618,188 @@ static ssize_t msg_print_ext_body(char *buf, size_t size,
> return len;
> }
>
> +#ifdef CONFIG_PRINTK_ENUMERATION
> +
> +/* /proc/printk_formats - userspace enumeration of printk formats */
> +
> +struct printk_fmt_sec {
> + struct list_head list;
> + const struct module *module;
> + const char **start;
> + const char **end;
> +};
> +
> +/*
> + * Stores .printk_fmt section boundaries for vmlinux and all loaded modules.
> + * Add entries with store_printk_fmt_sec, remove entries with
> + * remove_printk_fmt_sec.
> + */
> +static LIST_HEAD(printk_fmts_list);
> +
> +/*
> + * To prevent mutation, hold printk_fmts_mutex. If mutation is acceptable, this
> + * can be read at any time without the mutex as long as it is read with
> + * protection against data tearing.
> + */
> +static size_t printk_fmts_total_size;
> +
> +/* For printk_fmts_list and printk_fmts_total_size. */
> +static DEFINE_MUTEX(printk_fmts_mutex);
> +
> +static void store_printk_fmt_sec(const struct module *mod, const char **start,
> + const char **end)
> +{
> + struct printk_fmt_sec *ps = NULL;
> + const char **fptr = NULL;
> + size_t size = 0;
> +
> + ps = kmalloc(sizeof(struct printk_fmt_sec), GFP_KERNEL);
> + if (!ps)
> + return;
> +
> + ps->module = mod;
> + ps->start = start;
> + ps->end = end;
> +
> + for (fptr = ps->start; fptr < ps->end; fptr++)
> + size += strlen(*fptr) + 1;
> +
> + mutex_lock(&printk_fmts_mutex);
> + WRITE_ONCE(printk_fmts_total_size, printk_fmts_total_size + size);
> + list_add_tail(&ps->list, &printk_fmts_list);
> + mutex_unlock(&printk_fmts_mutex);
> +}
> +
> +#ifdef CONFIG_MODULES
> +
> +static void remove_printk_fmt_sec(const struct module *mod)
> +{
> + struct printk_fmt_sec *tmp = NULL, *ps = NULL;
> + const char **fptr = NULL;
> +
> + if (WARN_ON_ONCE(!mod))
> + return;
> +
> + mutex_lock(&printk_fmts_mutex);
> +
> + list_for_each_entry_safe(ps, tmp, &printk_fmts_list, list) {
> + if (ps->module == mod) {
> + size_t new_size = printk_fmts_total_size;
> +
> + for (fptr = ps->start; fptr < ps->end; fptr++)
> + new_size -= strlen(*fptr) + 1;
> +
> + WRITE_ONCE(printk_fmts_total_size, new_size);
> + list_del(&ps->list);
> + kfree(ps);
> + break;
> + }
> + }
> +
> + mutex_unlock(&printk_fmts_mutex);
> +}
> +
> +static int module_printk_fmts_notify(struct notifier_block *self,
> + unsigned long val, void *data)
> +{
> + const struct module *mod = data;
> +
> + if (mod->printk_fmts_sec_size) {
> + switch (val) {
> + case MODULE_STATE_COMING:
> + store_printk_fmt_sec(mod, mod->printk_fmts_start,
> + mod->printk_fmts_start +
> + mod->printk_fmts_sec_size);
> + break;
> +
> + case MODULE_STATE_GOING:
> + remove_printk_fmt_sec(mod);
> + break;
> + }
> + }
> +
> + return NOTIFY_OK;
> +}
> +
> +static const char *ps_get_module_name(const struct printk_fmt_sec *ps)
> +{
> + return ps->module ? ps->module->name : "vmlinux";
> +}
> +
> +static struct notifier_block module_printk_fmts_nb = {
> + .notifier_call = module_printk_fmts_notify,
> +};
> +
> +static int __init module_printk_fmts_init(void)
> +{
> + return register_module_notifier(&module_printk_fmts_nb);
> +}
> +
> +core_initcall(module_printk_fmts_init);
> +
> +#else /* !CONFIG_MODULES */
> +static const char *ps_get_module_name(const struct printk_fmt_sec *ps)
> +{
> + return "vmlinux";
> +}
> +#endif /* CONFIG_MODULES */
> +
> +static int proc_pf_show(struct seq_file *s, void *v)
> +{
> + const struct printk_fmt_sec *ps = NULL;
> + const char **fptr = NULL;
> +
> + mutex_lock(&printk_fmts_mutex);
> +
> + list_for_each_entry(ps, &printk_fmts_list, list) {
> + const char *mod_name = ps_get_module_name(ps);
> +
> + for (fptr = ps->start; fptr < ps->end; fptr++) {
> + seq_puts(s, mod_name);
> + seq_putc(s, ',');
> + seq_puts(s, *fptr);
> + seq_putc(s, '\0');
> + }
You probably should get inspiration from t_show() in trace_printk.c.
It handles newlines, ...
Or by ddebug_proc_show(). It uses seq_escape().
Anyway, there is something wrong at the moment. The output looks fine
with cat. But "less" says that it is a binary format and the output
is a bit messy:
$> less /proc/printk_formats
"/proc/printk_formats" may be a binary file. See it anyway?
vmlinux,^A3Warning: unable to open an initial console.
^@vmlinux,^A3Failed to execute %s (error %d)
^@vmlinux,^A6Kernel memory protection disabled.
^@vmlinux,^A3Starting init: %s exists but couldn't execute it (error %d)
That is for now. I still have to think about it. And I am also curious
about what others thing about this idea.
Best Regards,
Petr
> + }
> +
> + mutex_unlock(&printk_fmts_mutex);
> +
> + return 0;
> +}
> +
> +static int proc_pf_open(struct inode *inode, struct file *file)
> +{
> + /*
> + * We don't need to hold the mutex here to ensure that
> + * printk_fmts_total_size doesn't change prior to iteration -- worst
> + * case, seq_read_iter() will reallocate.
> + */
> + return single_open_size(file, proc_pf_show, NULL,
> + READ_ONCE(printk_fmts_total_size));
> +}
> +
> +static const struct proc_ops printk_proc_ops = {
> + .proc_flags = PROC_ENTRY_PERMANENT,
> + .proc_open = proc_pf_open,
> + .proc_read_iter = seq_read_iter,
> + .proc_lseek = seq_lseek,
> + .proc_release = single_release,
> +};
> +
> +static int __init init_printk_fmts(void)
> +{
> + const struct proc_dir_entry *pd = NULL;
> +
> + store_printk_fmt_sec(NULL, __start_printk_fmts, __stop_printk_fmts);
> + pd = proc_create("printk_formats", 0, NULL, &printk_proc_ops);
> +
> + return pd ? 0 : -ENOMEM;
> +}
> +
> +core_initcall(init_printk_fmts);
> +
> +#endif /* CONFIG_PRINTK_ENUMERATION */
> +
> /* /dev/kmsg - userspace message inject/listen interface */
> struct devkmsg_user {
> u64 seq;
> @@ -2111,10 +2294,13 @@ int vprintk_default(const char *fmt, va_list args)
> EXPORT_SYMBOL_GPL(vprintk_default);
>
> /**
> - * printk - print a kernel message
> + * _printk - print a kernel message
> * @fmt: format string
> *
> - * This is printk(). It can be called from any context. We want it to work.
> + * This is _printk(). It can be called from any context. We want it to work.
> + *
> + * If printk enumeration is enabled, _printk() is called from printk_store_fmt.
> + * Otherwise, printk is simply #defined to _printk.
> *
> * We try to grab the console_lock. If we succeed, it's easy - we log the
> * output and call the console drivers. If we fail to get the semaphore, we
> @@ -2131,7 +2317,7 @@ EXPORT_SYMBOL_GPL(vprintk_default);
> *
> * See the vsnprintf() documentation for format string extensions over C99.
> */
> -asmlinkage __visible int printk(const char *fmt, ...)
> +asmlinkage __visible int _printk(const char *fmt, ...)
> {
> va_list args;
> int r;
> @@ -2142,7 +2328,7 @@ asmlinkage __visible int printk(const char *fmt, ...)
>
> return r;
> }
> -EXPORT_SYMBOL(printk);
> +EXPORT_SYMBOL(_printk);
>
> #else /* CONFIG_PRINTK */
>
> @@ -3133,7 +3319,7 @@ int vprintk_deferred(const char *fmt, va_list args)
> return r;
> }
>
> -int printk_deferred(const char *fmt, ...)
> +int _printk_deferred(const char *fmt, ...)
> {
> va_list args;
> int r;
> --
> 2.30.0
On Fri, 5 Feb 2021 17:42:55 +0100
Petr Mladek <[email protected]> wrote:
> Hi,
>
> I would like to hear opinion from a bigger audience. It is an
> userspace interface that we might need to maintain forewer.
> Adding few more people in to CC:
>
> Steven Rostedt <[email protected]>: printk co-maintainer
Thanks for Cc'ing me.
> Alexey Dobriyan <[email protected]>: fs/proc maintainer
> Greg Kroah-Hartman <[email protected]>: sysfs maintainer
> Jason Baron <[email protected]>: dynamic_debug maintainer
> Kees Cook <[email protected]>: security POV
> [email protected]: Linux API mailing list
>
> Of course, we should also ask if this is the right approach
> for the think that you want to achieve.
>
> The motivation for this patch is that the strings printed by kernels
> are not reliable and you want a simple way to compare differences
> bethween versions. Do I get it right?
>
> See more comments below.
>
>
> Also this is yet another style how the format is displayed. We already have
>
> + console/syslog: formated by record_print_text()
> + /dev/kmsg: formatted by info_print_ext_header(), msg_print_ext_body().
> + /sys/kernel/debug/dynamic_debug/control
> + /sys/kernel/debug/tracing/printk_formats
>
> We should get some inspiration from the existing interfaces.
Interesting, because when I was looking at the original patch (looked at
the lore link before reading your reply), I thought to myself "this looks
exactly like what I did for trace_printk formats", which the above file is
where it is shown. I'm curious if this work was inspired by that?
> > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > index 34b7e0d2346c..0ca6e28e05d6 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -309,6 +309,17 @@
> > #define ACPI_PROBE_TABLE(name)
> > #endif
> >
> > +#ifdef CONFIG_PRINTK_ENUMERATION
> > +#define PRINTK_FMTS \
> > + .printk_fmts : AT(ADDR(.printk_fmts) - LOAD_OFFSET) { \
> > + __start_printk_fmts = .; \
> > + *(.printk_fmts) \
> > + __stop_printk_fmts = .; \
> > + }
> > +#else
> > +#define PRINTK_FMTS
> > +#endif
>
> It should be defined after #define TRACEDATA to follow the existing
> style.
>
> But honestly I am not much familiar with the sections definitions.
> I am curious why TRACE_PRINTKS() and __dyndbg are defined
> a bit different way.
>
I'm not sure what difference you mean.
> > +static int proc_pf_show(struct seq_file *s, void *v)
> > +{
> > + const struct printk_fmt_sec *ps = NULL;
> > + const char **fptr = NULL;
> > +
> > + mutex_lock(&printk_fmts_mutex);
> > +
> > + list_for_each_entry(ps, &printk_fmts_list, list) {
> > + const char *mod_name = ps_get_module_name(ps);
> > +
> > + for (fptr = ps->start; fptr < ps->end; fptr++) {
> > + seq_puts(s, mod_name);
> > + seq_putc(s, ',');
> > + seq_puts(s, *fptr);
> > + seq_putc(s, '\0');
> > + }
>
> You probably should get inspiration from t_show() in trace_printk.c.
> It handles newlines, ...
>
> Or by ddebug_proc_show(). It uses seq_escape().
>
> Anyway, there is something wrong at the moment. The output looks fine
> with cat. But "less" says that it is a binary format and the output
> is a bit messy:
Hmm, that's usually the case when lseek gets messed up. Not sure how that
happened.
>
> $> less /proc/printk_formats
> "/proc/printk_formats" may be a binary file. See it anyway?
> vmlinux,^A3Warning: unable to open an initial console.
> ^@vmlinux,^A3Failed to execute %s (error %d)
> ^@vmlinux,^A6Kernel memory protection disabled.
> ^@vmlinux,^A3Starting init: %s exists but couldn't execute it (error %d)
>
>
> That is for now. I still have to think about it. And I am also curious
> about what others thing about this idea.
>
I'm not against the idea. I don't think it belongs in /proc. Perhaps
debugfs is a better place to put it.
-- Steve
Oh, and one more I just spotted and will fix in v2 after other feedback is in:
Chris Down writes:
>+static void store_printk_fmt_sec(const struct module *mod, const char **start,
>+ const char **end)
>+{
>+ struct printk_fmt_sec *ps = NULL;
>+ const char **fptr = NULL;
>+ size_t size = 0;
>+
>+ ps = kmalloc(sizeof(struct printk_fmt_sec), GFP_KERNEL);
>+ if (!ps)
>+ return;
>+
>+ ps->module = mod;
>+ ps->start = start;
>+ ps->end = end;
>+
>+ for (fptr = ps->start; fptr < ps->end; fptr++)
>+ size += strlen(*fptr) + 1;
This still works, but is out of date and doesn't account for the module and
comma, so results in one more unnecessary round trip doing seq_buf_alloc.
Hi Steven,
Steven Rostedt writes:
>Interesting, because when I was looking at the original patch (looked at
>the lore link before reading your reply), I thought to myself "this looks
>exactly like what I did for trace_printk formats", which the above file is
>where it is shown. I'm curious if this work was inspired by that?
The double __builtin_constant_p() trick was suggested by Johannes based on
prior art in trace_puts() just prior to patch submission. Other than that, it
seems we came up with basically the same solution independently. :-)
>> Anyway, there is something wrong at the moment. The output looks fine
>> with cat. But "less" says that it is a binary format and the output
>> is a bit messy:
>
>Hmm, that's usually the case when lseek gets messed up. Not sure how that
>happened.
It looks as intended to me -- none of the newlines, nulls, or other control
sequences are escaped currently, since I didn't immediately see a reason to do
that. If that's a blocker though, I'm happy to change it.
>> $> less /proc/printk_formats
>> "/proc/printk_formats" may be a binary file. See it anyway?
>> vmlinux,^A3Warning: unable to open an initial console.
>> ^@vmlinux,^A3Failed to execute %s (error %d)
>> ^@vmlinux,^A6Kernel memory protection disabled.
>> ^@vmlinux,^A3Starting init: %s exists but couldn't execute it (error %d)
>>
>>
>> That is for now. I still have to think about it. And I am also curious
>> about what others thing about this idea.
>>
>
>I'm not against the idea. I don't think it belongs in /proc. Perhaps
>debugfs is a better place to put it.
Any location is fine with me, as long as it gets to userspace. How does
<debugfs>/printk/formats or <debugfs>/printk/formats/<module> sound to you?
Thanks,
Chris
Hi Petr,
Thanks for looking over the patch. :-)
Petr Mladek writes:
>> Most production issues come from unexpected phenomena, and as such
>> usually the code in question doesn't have easily usable tracepoints or
>> other counters available for the specific problem being mitigated. We
>> have a number of lines of monitoring defence against problems in
>> production (host metrics, process metrics, service metrics, etc), and
>> where it's not feasible to reliably monitor at another level, this kind
>> of pragmatic netconsole monitoring is essential.
>>
>> As you'd expect, monitoring using printk is rather brittle for a number
>> of reasons -- most notably that the message might disappear entirely in
>> a new version of the kernel, or that the message may change in some way
>> that the regex or other classification methods start to silently fail.
>
>Another is that printk() is not reliable on its own. Messages might
>get lost. The size of the log buffer is limited. Deamon reading
>/dev/kmsg need not be scheduled in time or often enough. Console
>might be slow. The messages are filtered on the console by console_loglevel.
This is of course true. We don't use kmsg as the last line of defence for
monitoring or remediation, of course, but it would be unwise to not have
infrastructure capable of monitoring it. We often need to act quickly when
production incidents happen, and often kmsg is the place where those
"unexpected" issues are surfaced. It's often much more likely that there is
some kmsg log which we can act on in those scenarios than anything else, and
even if it's not ideal, in reality, it's typically reliable enough to at least
mitigate the problem when dealing with a large fleet of machines :-)
>> # Format: <module>,<facility><level><format>\0
>> $ perl -p -e 's/\n/\\n/g;s/\0/\n/g' /proc/printk_formats | shuf -n 5
>> vmlinux,6Disabling APIC timer\n
>> intel_rapl_common,3intel_rapl_common: Cannot find matching power limit for constraint %d\n
>> dm_crypt,3device-mapper: crypt: %s: INTEGRITY AEAD ERROR, sector %llu\n
>> mac80211,6%s: AP bug: HT capability missing from AssocResp\n
>> vmlinux,3zpool: couldn't create zpool - out of memory\n
>
>The facility and log level are not well separated from the format string.
>
>Also this is yet another style how the format is displayed. We already have
>
> + console/syslog: formated by record_print_text()
> + /dev/kmsg: formatted by info_print_ext_header(), msg_print_ext_body().
> + /sys/kernel/debug/dynamic_debug/control
> + /sys/kernel/debug/tracing/printk_formats
>
>We should get some inspiration from the existing interfaces.
Sure, I'm not super bound to the format, as long as we have something that can
aid those maintaining these systems which monitor printk in identifying that a
format was mutated or removed. The module is more or less optional -- it's just
intended as a hint about where to look.
>But we first should decide what information might be useful:
>
> + 'facility' should not be needed. All messages should be from
> kernel.
That's fair enough, it can be omitted. I just didn't want to stray too far from
the netconsole format, since we already mostly have it in this format there.
My intention is to _not_ deviate from existing interfaces, really, so I'll be
happy with any suggested format that will achieve this patch's stated goals,
since this kind of data is sorely needed :-)
> + <module> is already optinaly added by pr_fmt() to the printed strings
> as: pr_fmt(): ...
pr_fmts are not consistently used across the kernel, and sometimes differ from
the module itself. Many modules don't use it at all, and we also don't have it
for pr_cont. Just picking some random examples:
% grep -av vmlinux /proc/printk_formats | shuf -n 10
mac80211,6%s: mesh STA %pM switches to channel requiring DFS (%d MHz, width:%d, CF1/2: %d/%d MHz), aborting
thinkpad_acpi,c N/Athinkpad_acpi,c %dthinkpad_acpi,5thinkpad_acpi: temperatures (Celsius):thinkpad_acpi,3thinkpad_acpi: Out of memory for LED data
i915,6drm/i915 developers can then reassign to the right component if it's not a kernel issue.
video,4[Firmware Bug]: _BCQ is used instead of _BQC
i915,3gvt: requesting SMI service
are MMIO SPTEs.
i915,3gvt: invalid tiling mode: %x
video,3ACPI: Create sysfs link
cec,6cec-%s: duplicate logical address type
soundwire_bus,3%s: %s: inconsistent state state %d
>> +static int proc_pf_show(struct seq_file *s, void *v)
>> +{
>> + const struct printk_fmt_sec *ps = NULL;
>> + const char **fptr = NULL;
>> +
>> + mutex_lock(&printk_fmts_mutex);
>> +
>> + list_for_each_entry(ps, &printk_fmts_list, list) {
>> + const char *mod_name = ps_get_module_name(ps);
>> +
>> + for (fptr = ps->start; fptr < ps->end; fptr++) {
>> + seq_puts(s, mod_name);
>> + seq_putc(s, ',');
>> + seq_puts(s, *fptr);
>> + seq_putc(s, '\0');
>> + }
>
>You probably should get inspiration from t_show() in trace_printk.c.
>It handles newlines, ...
>
>Or by ddebug_proc_show(). It uses seq_escape().
>
>Anyway, there is something wrong at the moment. The output looks fine
>with cat. But "less" says that it is a binary format and the output
>is a bit messy:
Hmm, why should that be a problem? It's intentional that this pretty much just
directly replicates the format string passed to printk, since it's easy to
write a parser for it:
1. Go up to the comma, take the module
2. Take the facility and level
3. Take the rest up to a \0 as the format
4. Go to 1
I don't mind to have it escaped, but I'm not immediately seeing the benefit. We
also don't escape `\0` in (for example) `/proc/pid/cmdline`, since it serves as
a good natural delimiter.
Thanks for taking the time to review :-)
Chris
On Fri, 5 Feb 2021 22:45:19 +0000
Chris Down <[email protected]> wrote:
> >I'm not against the idea. I don't think it belongs in /proc. Perhaps
> >debugfs is a better place to put it.
>
> Any location is fine with me, as long as it gets to userspace. How does
> <debugfs>/printk/formats or <debugfs>/printk/formats/<module> sound to you?
I'm fine with it, if others are.
Something like this will probably need approval from others on the Cc list
here.
-- Steve
On Fri, Feb 05, 2021 at 10:45:19PM +0000, Chris Down wrote:
> Hi Steven,
>
> Steven Rostedt writes:
> > Interesting, because when I was looking at the original patch (looked at
> > the lore link before reading your reply), I thought to myself "this looks
> > exactly like what I did for trace_printk formats", which the above file is
> > where it is shown. I'm curious if this work was inspired by that?
>
> The double __builtin_constant_p() trick was suggested by Johannes based on
> prior art in trace_puts() just prior to patch submission. Other than that,
> it seems we came up with basically the same solution independently. :-)
>
> > > Anyway, there is something wrong at the moment. The output looks fine
> > > with cat. But "less" says that it is a binary format and the output
> > > is a bit messy:
> >
> > Hmm, that's usually the case when lseek gets messed up. Not sure how that
> > happened.
>
> It looks as intended to me -- none of the newlines, nulls, or other control
> sequences are escaped currently, since I didn't immediately see a reason to
> do that. If that's a blocker though, I'm happy to change it.
>
> > > $> less /proc/printk_formats
> > > "/proc/printk_formats" may be a binary file. See it anyway?
> > > vmlinux,^A3Warning: unable to open an initial console.
> > > ^@vmlinux,^A3Failed to execute %s (error %d)
> > > ^@vmlinux,^A6Kernel memory protection disabled.
> > > ^@vmlinux,^A3Starting init: %s exists but couldn't execute it (error %d)
> > >
> > >
> > > That is for now. I still have to think about it. And I am also curious
> > > about what others thing about this idea.
> > >
> >
> > I'm not against the idea. I don't think it belongs in /proc. Perhaps
> > debugfs is a better place to put it.
>
> Any location is fine with me, as long as it gets to userspace. How does
> <debugfs>/printk/formats or <debugfs>/printk/formats/<module> sound to you?
That's fine with me, but I'd like to see the patch with this in it first
before approving it :)
thanks,
greg k-h
Greg Kroah-Hartman writes:
>> > I'm not against the idea. I don't think it belongs in /proc. Perhaps
>> > debugfs is a better place to put it.
>>
>> Any location is fine with me, as long as it gets to userspace. How does
>> <debugfs>/printk/formats or <debugfs>/printk/formats/<module> sound to you?
>
>That's fine with me, but I'd like to see the patch with this in it first
>before approving it :)
Thanks! I'll send v2 soon then with the aforementioned changes, and a move to
debugfs.
On Fri, 2021-02-05 at 22:25 +0000, Chris Down wrote:
> Petr Mladek writes:
> > ??+ <module> is already optinaly added by pr_fmt() to the printed strings
> > ????as: pr_fmt(): ...
>
> pr_fmts are not consistently used across the kernel, and sometimes differ from
> the module itself. Many modules don't use it at all, and we also don't have it
> for pr_cont. Just picking some random examples:
>
> ?????% grep -av vmlinux /proc/printk_formats | shuf -n 10
> ?????mac80211,6%s: mesh STA %pM switches to channel requiring DFS (%d MHz, width:%d, CF1/2: %d/%d MHz), aborting
> ?????thinkpad_acpi,c N/Athinkpad_acpi,c %dthinkpad_acpi,5thinkpad_acpi: temperatures (Celsius):thinkpad_acpi,3thinkpad_acpi: Out of memory for LED data
I don't understand this format.
"Out of memory for LED data" is a single printk ending with a '\n' newline
I expected this to be broken up into multiple lines, one for each printk
that endsd in a newline.
And what would happen if the function was refactored removing the pr_cont
uses like the below: (basically, any output that uses a mechanism that
aggregates a buffer then emits it, and there are a _lot_ of those)
printk("%s\n", buffer);
And there is already a relatively trivial way to do this using a modified
version of strings that looks for KERN_SOH[0-6], and if dynamic_debug is
enabled, look in the dynamic_debug section, either __verbose or __dyndbg
depending on the kernel version.
---
drivers/platform/x86/thinkpad_acpi.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c
index 18b390153e7f..ff1c09c600f8 100644
--- a/drivers/platform/x86/thinkpad_acpi.c
+++ b/drivers/platform/x86/thinkpad_acpi.c
@@ -6353,21 +6353,26 @@ static void thermal_dump_all_sensors(void)
{
int n, i;
struct ibm_thermal_sensors_struct t;
+ char output[256];
+ int len = 0;
n = thermal_get_sensors(&t);
if (n <= 0)
return;
- pr_notice("temperatures (Celsius):");
+ len += scnprintf(output + len, sizeof(output) - len,
+ "temperatures (Celsius):");
for (i = 0; i < n; i++) {
- if (t.temp[i] != TPACPI_THERMAL_SENSOR_NA)
- pr_cont(" %d", (int)(t.temp[i] / 1000));
+ if (t.temp[i] == TPACPI_THERMAL_SENSOR_NA)
+ len += scnprintf(output + len, sizeof(output) - len,
+ " N/A");
else
- pr_cont(" N/A");
+ len += scnprintf(output + len, sizeof(output) - len,
+ " %d", t.temp[i] / 1000);
}
- pr_cont("\n");
+ pr_notice("%s\n", output);
}
/* sysfs temp##_input -------------------------------------------------- */
Joe Perches writes:
>On Fri, 2021-02-05 at 22:25 +0000, Chris Down wrote:
>> Petr Mladek writes:
>> > ??+ <module> is already optinaly added by pr_fmt() to the printed strings
>> > ????as: pr_fmt(): ...
>>
>> pr_fmts are not consistently used across the kernel, and sometimes differ from
>> the module itself. Many modules don't use it at all, and we also don't have it
>> for pr_cont. Just picking some random examples:
>>
>> ?????% grep -av vmlinux /proc/printk_formats | shuf -n 10
>> ?????mac80211,6%s: mesh STA %pM switches to channel requiring DFS (%d MHz, width:%d, CF1/2: %d/%d MHz), aborting
>> ?????thinkpad_acpi,c N/Athinkpad_acpi,c %dthinkpad_acpi,5thinkpad_acpi: temperatures (Celsius):thinkpad_acpi,3thinkpad_acpi: Out of memory for LED data
>
>I don't understand this format.
>
>"Out of memory for LED data" is a single printk ending with a '\n' newline
>I expected this to be broken up into multiple lines, one for each printk
>that endsd in a newline.
Hmm, that's just a manifestation of directly using `shuf` without doing the
transformation of trailing nulls to newlines shown in the changelog. They are
still distinct and separated by nulls.
>And what would happen if the function was refactored removing the pr_cont
>uses like the below: (basically, any output that uses a mechanism that
>aggregates a buffer then emits it, and there are a _lot_ of those)
>
> printk("%s\n", buffer);
There are certainly printks which can't be trivially monitored using the printk
format alone, but the vast majority of the ones that are monitored _do_ have
meaningful formats and can be monitored over time. No solution to this is going
to catch every single case, especially when so much of the information can be
generated dyamically, but this patchset still goes a long way to making printk
monitoring more tractable for use cases like the one described in the
changelog.
On Sat, 2021-02-06 at 21:21 +0000, Chris Down wrote:
> Joe Perches writes:
> > On Fri, 2021-02-05 at 22:25 +0000, Chris Down wrote:
> > > Petr Mladek writes:
> > > > ??+ <module> is already optinaly added by pr_fmt() to the printed strings
> > > > ????as: pr_fmt(): ...
> > >
> > > pr_fmts are not consistently used across the kernel, and sometimes differ from
> > > the module itself. Many modules don't use it at all, and we also don't have it
> > > for pr_cont. Just picking some random examples:
> > >
> > > ?????% grep -av vmlinux /proc/printk_formats | shuf -n 10
> > > ?????mac80211,6%s: mesh STA %pM switches to channel requiring DFS (%d MHz, width:%d, CF1/2: %d/%d MHz), aborting
> > > ?????thinkpad_acpi,c N/Athinkpad_acpi,c %dthinkpad_acpi,5thinkpad_acpi: temperatures (Celsius):thinkpad_acpi,3thinkpad_acpi: Out of memory for LED data
> >
> > I don't understand this format.
> >
> > "Out of memory for LED data" is a single printk ending with a '\n' newline
> > I expected this to be broken up into multiple lines, one for each printk
> > that endsd in a newline.
>
> Hmm, that's just a manifestation of directly using `shuf` without doing the
> transformation of trailing nulls to newlines shown in the changelog. They are
> still distinct and separated by nulls.
>
> > And what would happen if the function was refactored removing the pr_cont
> > uses like the below: (basically, any output that uses a mechanism that
> > aggregates a buffer then emits it, and there are a _lot_ of those)
> >
> > printk("%s\n", buffer);
>
> There are certainly printks which can't be trivially monitored using the printk
> format alone, but the vast majority of the ones that are monitored _do_ have
> meaningful formats and can be monitored over time. No solution to this is going
> to catch every single case, especially when so much of the information can be
> generated dyamically, but this patchset still goes a long way to making printk
> monitoring more tractable for use cases like the one described in the
> changelog.
For the _vast_ majority of printk strings, this can easily be found
and compared using a trivial modification to strings.
Module specific formats are stored in the .ko files and could be
examined separately.
Here's the possible patch to strings:
---
binutils/strings.c | 98 +++++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 82 insertions(+), 16 deletions(-)
diff --git a/binutils/strings.c b/binutils/strings.c
index 3444fbc7222..d4abeb0cdbb 100644
--- a/binutils/strings.c
+++ b/binutils/strings.c
@@ -26,6 +26,10 @@
--data
-d Scan only the initialized data section(s) of object files.
+ --section=<section>
+ -S <section> Scan only the specific section(s)
+ --kernel
+ -k Scan only linux-kernel KERN_SOH strings
--print-file-name
-f Print the name of the file before each string.
@@ -108,6 +112,14 @@ static bfd_boolean print_filenames;
/* TRUE means for object files scan only the data section. */
static bfd_boolean datasection_only;
+/* TRUE means for object files scan only the specified sections. */
+static bfd_boolean specified_sections_only;
+static int specified_sections_count;
+static char **specified_sections;
+
+/* TRUE means scan only linux-kernel printk strings with KERN_SOH. */
+static bfd_boolean linux_kernel_soh;
+
/* The BFD object file format. */
static char *target;
@@ -122,6 +134,8 @@ static struct option long_options[] =
{
{"all", no_argument, NULL, 'a'},
{"data", no_argument, NULL, 'd'},
+ {"section", required_argument, NULL, 'S'},
+ {"kernel", no_argument, NULL, 'k'},
{"print-file-name", no_argument, NULL, 'f'},
{"bytes", required_argument, NULL, 'n'},
{"radix", required_argument, NULL, 't'},
@@ -173,7 +187,7 @@ main (int argc, char **argv)
encoding = 's';
output_separator = NULL;
- while ((optc = getopt_long (argc, argv, "adfhHn:wot:e:T:s:Vv0123456789",
+ while ((optc = getopt_long (argc, argv, "adS:s:kfhHn:wot:e:T:s:Vv0123456789",
long_options, (int *) 0)) != EOF)
{
switch (optc)
@@ -186,6 +200,17 @@ main (int argc, char **argv)
datasection_only = TRUE;
break;
+ case 'S':
+ specified_sections_only = TRUE;
+ specified_sections = xrealloc(specified_sections,
+ (specified_sections_count + 1) * sizeof(const char *));
+ specified_sections[specified_sections_count++] = optarg;
+ break;
+
+ case 'k':
+ linux_kernel_soh = TRUE;
+ break;
+
case 'f':
print_filenames = TRUE;
break;
@@ -318,6 +343,19 @@ main (int argc, char **argv)
return (exit_status);
}
+static bfd_boolean
+section_is_specified_section (asection *sect)
+{
+ int i;
+ for (i = 0; i < specified_sections_count; i++)
+ {
+ if (strcmp(specified_sections[i], sect->name) == 0) {
+ return TRUE;
+ }
+ }
+ return FALSE;
+}
+
/* Scan section SECT of the file ABFD, whose printable name is
FILENAME. If it contains initialized data set GOT_A_SECTION and
print the strings in it. */
@@ -329,7 +367,9 @@ strings_a_section (bfd *abfd, asection *sect, const char *filename,
bfd_size_type sectsize;
bfd_byte *mem;
- if ((sect->flags & DATA_FLAGS) != DATA_FLAGS)
+ if (specified_sections_only && !section_is_specified_section (sect))
+ return;
+ if (datasection_only && ((sect->flags & DATA_FLAGS) != DATA_FLAGS))
return;
sectsize = bfd_section_size (sect);
@@ -417,7 +457,7 @@ strings_file (char *file)
try to open it as an object file and only look at
initialized data sections. If that fails, fall back to the
whole file. */
- if (!datasection_only || !strings_object_file (file))
+ if (!(datasection_only || specified_sections_only) || !strings_object_file (file))
{
FILE *stream;
@@ -571,6 +611,7 @@ print_strings (const char *filename, FILE *stream, file_ptr address,
int stop_point, int magiccount, char *magic)
{
char *buf = (char *) xmalloc (sizeof (char) * (string_min + 1));
+ int is_kernel = 0;
while (1)
{
@@ -594,6 +635,7 @@ print_strings (const char *filename, FILE *stream, file_ptr address,
if (! STRING_ISGRAPHIC (c))
{
+ is_kernel = c == 1;
/* Found a non-graphic. Try again starting with next byte. */
unget_part_char (c, &address, &magiccount, &magic);
goto tryline;
@@ -601,6 +643,22 @@ print_strings (const char *filename, FILE *stream, file_ptr address,
buf[i] = c;
}
+ if ((linux_kernel_soh && !is_kernel) ||
+ (linux_kernel_soh && !strchr("01234567cd", buf[0])))
+ {
+ while (1) {
+ c = get_char (stream, &address, &magiccount, &magic);
+ if (c == EOF)
+ return;
+ if (! STRING_ISGRAPHIC (c))
+ {
+ unget_part_char (c, &address, &magiccount, &magic);
+ break;
+ }
+ }
+ goto tryline;
+ }
+
/* We found a run of `string_min' graphic characters. Print up
to the next non-graphic character. */
@@ -668,8 +726,11 @@ print_strings (const char *filename, FILE *stream, file_ptr address,
break;
}
- buf[i] = '\0';
- fputs (buf, stdout);
+ if (!linux_kernel_soh || is_kernel)
+ {
+ buf[i] = '\0';
+ fputs (buf, stdout);
+ }
while (1)
{
@@ -681,13 +742,20 @@ print_strings (const char *filename, FILE *stream, file_ptr address,
unget_part_char (c, &address, &magiccount, &magic);
break;
}
- putchar (c);
+ if (!linux_kernel_soh || is_kernel)
+ {
+ putchar (c);
+ }
}
- if (output_separator)
- fputs (output_separator, stdout);
- else
- putchar ('\n');
+ if (!linux_kernel_soh || is_kernel)
+ {
+ if (output_separator)
+ fputs (output_separator, stdout);
+ else
+ putchar ('\n');
+ }
+ is_kernel = 0;
}
free (buf);
}
@@ -702,13 +770,11 @@ usage (FILE *stream, int status)
if (DEFAULT_STRINGS_ALL)
fprintf (stream, _("\
-a - --all Scan the entire file, not just the data section [default]\n\
- -d --data Only scan the data sections in the file\n"));
- else
- fprintf (stream, _("\
+ -d --data Only scan the data sections in the file\n\
+ -S --section=[section] Only scan the specific sections\n\
+ -k --kernel Only scan for linux-kernel strings with KERN_SOH\n\
-a - --all Scan the entire file, not just the data section\n\
- -d --data Only scan the data sections in the file [default]\n"));
-
- fprintf (stream, _("\
+ -d --data Only scan the data sections in the file [default]\n\
-f --print-file-name Print the name of the file before each string\n\
-n --bytes=[number] Locate & print any NUL-terminated sequence of at\n\
-<number> least [number] characters (default 4).\n\
Joe Perches writes:
>> There are certainly printks which can't be trivially monitored using the printk
>> format alone, but the vast majority of the ones that are monitored _do_ have
>> meaningful formats and can be monitored over time. No solution to this is going
>> to catch every single case, especially when so much of the information can be
>> generated dyamically, but this patchset still goes a long way to making printk
>> monitoring more tractable for use cases like the one described in the
>> changelog.
>
>For the _vast_ majority of printk strings, this can easily be found
>and compared using a trivial modification to strings.
There are several issues with your proposed approach that make it unsuitable
for use as part of a reliable production environment:
1. It misses printk() formats without KERN_SOH
printk() formats without KERN_SOH are legal and use MESSAGE_LOGLEVEL_DEFAULT.
On my test kernel, your proposed patch loses >5% of printk formats -- over 200
messages -- due to this, including critical ones like those about hardware or
other errors.
2. Users don't always have the kernel image available
Many of our machines and many of the machines of others like us do not boot
using local storage, but instead use PXE or other technologies where the kernel
may not be stored during runtime.
As is described in the changelog, it is necessary to be able to vary
remediations not only based on what is already in /dev/kmsg, but also to be
able to make decisions about our methodology based on what's _supported_ in the
running kernel at runtime, and your proposed approach makes this not viable.
3. `KERN_SOH + level' can appear in other places than just printk strings
KERN_SOH is just ASCII '\001' -- it's not distinctive or unique, even when
paired with a check for something that looks like a level after it. For this
reason, your proposed patch results in a non-trivial amount of non-printk
related garbage in its output. For example:
% binutils/strings -k /tmp/vmlinux | head -5
3L)s
3L)s
c,[]A\
c(L)c
d$pL)d$`u4
Fundamentally, one cannot use a tool which just determines whether something is
printable to determine semantic intent.
4. strings(1) output cannot differentiate embedded newlines and new formats
The following has exactly the same output from strings(1), but will manifest
completely differently at printk() time:
printk(KERN_ERR "line one\nline two\nline three\n");
printk("line four\n");
With strings, the hypothetical output would be:*
3line one\nline two\nline three\nline four\n
* "line four\n" would also be missing with your current -k check.
But this makes it impossible to distinguish between this, compared to:
printk(KERN_ERR "line one\nline two\n");
printk("line three\n");
printk("line four\n");
The originally posted patch _does_ differentiate between these cases, using \0
as a reliable separator. Its outputs are, respectively:
\0013line one\nline two\nline three\0\nline four\n\0
\0013line one\nline two\n\0line three\nline four\n\0
This isn't just a theoretical concern -- there are plenty of places which use
multiline printks, and we must be able to distinguish between that and
_multiple_ printks. Not being able to differentiate cases like these would
dramatically reduce the effectiveness of printk enumeration, as we can no
longer ascertain which formats will always be used together (for example, in
the case of sequences of printks guarded by conditionals, which are all over
the place).
5. strings(1) is not contextually aware, and cannot be made to act as if it is
strings has no idea about what it is reading, which is why it is more than
happy to output the kind of meaningless output shown in #3. There are plenty of
places across the kernel where there might be a sequence of bytes which the
strings utility happens to interpret as being semantically meaningful, but in
reality just happens to be an unrelated sequence of coincidentally printable
bytes that just happens to contain a \001.
I appreciate your willingness to propose other solutions, but for these
reasons, the proposed strings(1) patch would not suffice as an interface for
printk enumeration.
On Sun, 2021-02-07 at 14:13 +0000, Chris Down wrote:
> Joe Perches writes:
> > > There are certainly printks which can't be trivially monitored using the printk
> > > format alone, but the vast majority of the ones that are monitored _do_ have
> > > meaningful formats and can be monitored over time. No solution to this is going
> > > to catch every single case, especially when so much of the information can be
> > > generated dyamically, but this patchset still goes a long way to making printk
> > > monitoring more tractable for use cases like the one described in the
> > > changelog.
> >
> > For the _vast_ majority of printk strings, this can easily be found
> > and compared using a trivial modification to strings.
>
> There are several issues with your proposed approach that make it unsuitable
> for use as part of a reliable production environment:
>
> 1. It misses printk() formats without KERN_SOH
>
> printk() formats without KERN_SOH are legal and use MESSAGE_LOGLEVEL_DEFAULT.
> On my test kernel, your proposed patch loses >5% of printk formats -- over 200
> messages -- due to this, including critical ones like those about hardware or
> other errors.
There are _very_ few of those printks without KERN_<level> and those
very few are not generally being changed.
> 2. Users don't always have the kernel image available
>
> Many of our machines and many of the machines of others like us do not boot
> using local storage, but instead use PXE or other technologies where the kernel
> may not be stored during runtime.
>
> As is described in the changelog, it is necessary to be able to vary
> remediations not only based on what is already in /dev/kmsg, but also to be
> able to make decisions about our methodology based on what's _supported_ in the
> running kernel at runtime, and your proposed approach makes this not viable.
Indirection would alway work.
You could load a separate file with output strings along with your
kernel image.
> 3. `KERN_SOH + level' can appear in other places than just printk strings
>
> KERN_SOH is just ASCII '\001' -- it's not distinctive or unique, even when
> paired with a check for something that looks like a level after it. For this
> reason, your proposed patch results in a non-trivial amount of non-printk
> related garbage in its output. For example:
>
> ?????% binutils/strings -k /tmp/vmlinux | head -5
> ?????3L)s
> ?????3L)s
> ?????c,[]A\
> ?????c(L)c
> ?????d$pL)d$`u4
>
> Fundamentally, one cannot use a tool which just determines whether something is
> printable to determine semantic intent.
$ kernel_strings --kernel --section ".rodata" vmlinux
I got exactly 0.
> 4. strings(1) output cannot differentiate embedded newlines and new formats
>
> The following has exactly the same output from strings(1), but will manifest
> completely differently at printk() time:
>
> ?????printk(KERN_ERR "line one\nline two\nline three\n");
> ?????printk("line four\n");
This is not the preferred output style and is only done in old and
unchanging code.
Your use case in your commit log is looking for _changed_ formats.
On Thu, 2021-02-04 at 15:37 +0000, Chris Down wrote:
> This patch provides a solution to the issue of silently changed or
> deleted printks:
Exactly _how_ many of these use cases do you think exist?
The generally preferred style for the example above would be:
pr_err("line one\n");
pr_err("line two\n");
pr_err("line three\n");
pr_err("line four\n");
> The originally posted patch _does_ differentiate between these cases, using \0
> as a reliable separator. Its outputs are, respectively:
>
> ?????\0013line one\nline two\nline three\0\nline four\n\0
> ?????\0013line one\nline two\n\0line three\nline four\n\0
>
> This isn't just a theoretical concern -- there are plenty of places which use
> multiline printks, and we must be able to distinguish between that and
> _multiple_ printks.
Just like there are many places that use buffered printks as the
example I gave earlier. None of which your proposed solution would find.
> 5. strings(1) is not contextually aware, and cannot be made to act as if it is
>
> strings has no idea about what it is reading, which is why it is more than
> happy to output the kind of meaningless output shown in #3. There are plenty of
> places across the kernel where there might be a sequence of bytes which the
> strings utility happens to interpret as being semantically meaningful, but in
> reality just happens to be an unrelated sequence of coincidentally printable
> bytes that just happens to contain a \001.
>
> I appreciate your willingness to propose other solutions, but for these
> reasons, the proposed strings(1) patch would not suffice as an interface for
> printk enumeration.
I think you are on a path to try to make printk output immutable.
I think that's a _very_ bad path.
I also think this is adding needless complexity.
A possible complexity I would like to support would be optionally
compressing printk format strings at compile time and uncompressing
them at use time.
Joe Perches writes:
>> There are several issues with your proposed approach that make it unsuitable
>> for use as part of a reliable production environment:
>>
>> 1. It misses printk() formats without KERN_SOH
>>
>> printk() formats without KERN_SOH are legal and use MESSAGE_LOGLEVEL_DEFAULT.
>> On my test kernel, your proposed patch loses >5% of printk formats -- over 200
>> messages -- due to this, including critical ones like those about hardware or
>> other errors.
>
>There are _very_ few of those printks without KERN_<level> and those
>very few are not generally being changed.
I already specified how many are lost: 5%. That's not "very few". That's a huge
proportion of the coverage afforded by this patch, including several important
cases.
Relying on "they generally don't change" is not a recipe for reliability or
success (and they do change, more data on that below).
>> 2. Users don't always have the kernel image available
>>
>> Many of our machines and many of the machines of others like us do not boot
>> using local storage, but instead use PXE or other technologies where the kernel
>> may not be stored during runtime.
>>
>> As is described in the changelog, it is necessary to be able to vary
>> remediations not only based on what is already in /dev/kmsg, but also to be
>> able to make decisions about our methodology based on what's _supported_ in the
>> running kernel at runtime, and your proposed approach makes this not viable.
>
>Indirection would alway work.
>
>You could load a separate file with output strings along with your
>kernel image.
You're moving the goalposts quite quickly here, which makes it harder to reply
to your points. Now you're proposing an entirely separate distribution path,
compared to interfaces that we already have precedent for in the kernel (eg.
trace_printk). That requires a strong justification, and I'm not seeing one
here.
>> 3. `KERN_SOH + level' can appear in other places than just printk strings
>>
>> KERN_SOH is just ASCII '\001' -- it's not distinctive or unique, even when
>> paired with a check for something that looks like a level after it. For this
>> reason, your proposed patch results in a non-trivial amount of non-printk
>> related garbage in its output. For example:
>>
>> ?????% binutils/strings -k /tmp/vmlinux | head -5
>> ?????3L)s
>> ?????3L)s
>> ?????c,[]A\
>> ?????c(L)c
>> ?????d$pL)d$`u4
>>
>> Fundamentally, one cannot use a tool which just determines whether something is
>> printable to determine semantic intent.
>
>$ kernel_strings --kernel --section ".rodata" vmlinux
>
>I got exactly 0.
"It works on my computer" is not a valid testing methodology, especially for
something as complex as the Linux kernel. It's especially not a valid rebuttal
to someone demonstrating that it clearly doesn't work on theirs.
Even filtering to the .rodata section, there's plenty of garbage just in the
first five cases:
% binutils/strings --kernel --section ".rodata" /tmp/vmlinux | head -5
3******* Your BIOS seems to not contain a fix for K8 errata #93
1>pBC)
dTRAC
6Run %s as init process
7calling %pS @ %i
Clearly there are cases that you are not considering. My kernel config is
attached if you want to try and replicate, but regardless, it's really not
valid to say "it works for me" in response to someone showing that it doesn't.
>> 4. strings(1) output cannot differentiate embedded newlines and new formats
>>
>> The following has exactly the same output from strings(1), but will manifest
>> completely differently at printk() time:
>>
>> ?????printk(KERN_ERR "line one\nline two\nline three\n");
>> ?????printk("line four\n");
>
>This is not the preferred output style and is only done in old and
>unchanging code.
>
>Your use case in your commit log is looking for _changed_ formats.
Joe, it's fine to present alternatives to people's patches, but please do your
research before spouting things like this. It's a waste of everyone's time to
refute things which are so easily demonstrated to be false.
Here are a bunch of recent changes to printk I found just from literally 2
minutes of looking through `git log`:
- ea34f78f3df6: 2020, printk site deleted (which of course we also need to know.)
- a0f6d924cada: 2020, new callsite. the level is printed dynamically, so your proposed patch would not match.
- bf13718bc57a: 2020, existing printk changed.
- 994388f228c6: 2020, printk site changed to au0828_isocdbg, reworded entirely.
- a8b62fd08505: 2020, new callsite, dynamic level.
I could find literally pages and pages of these just from the last few years.
Your belief that these printks are only in "unchanging" code does not match
reality.
>On Thu, 2021-02-04 at 15:37 +0000, Chris Down wrote:
>> This patch provides a solution to the issue of silently changed or
>> deleted printks:
>
>Exactly _how_ many of these use cases do you think exist?
>
>The generally preferred style for the example above would be:
>
> pr_err("line one\n");
> pr_err("line two\n");
> pr_err("line three\n");
> pr_err("line four\n");
I have no idea why you think this is so rare -- we have mixed pr_* and
unadorned printk() all over the codebase. A number of the patches I just gave
above are in files with mixed calls.
>> The originally posted patch _does_ differentiate between these cases, using \0
>> as a reliable separator. Its outputs are, respectively:
>>
>> ?????\0013line one\nline two\nline three\0\nline four\n\0
>> ?????\0013line one\nline two\n\0line three\nline four\n\0
>>
>> This isn't just a theoretical concern -- there are plenty of places which use
>> multiline printks, and we must be able to distinguish between that and
>> _multiple_ printks.
>
>Just like there are many places that use buffered printks as the
>example I gave earlier. None of which your proposed solution would find.
There are always going to be cases which are not caught. The point is that the
patch proposed in this thread captures significantly more cases than the
`strings` case (not to mention that it avoids outputting garbage from .rodata),
not that it covers every imaginable scenario.
>> 5. strings(1) is not contextually aware, and cannot be made to act as if it is
>>
>> strings has no idea about what it is reading, which is why it is more than
>> happy to output the kind of meaningless output shown in #3. There are plenty of
>> places across the kernel where there might be a sequence of bytes which the
>> strings utility happens to interpret as being semantically meaningful, but in
>> reality just happens to be an unrelated sequence of coincidentally printable
>> bytes that just happens to contain a \001.
>>
>> I appreciate your willingness to propose other solutions, but for these
>> reasons, the proposed strings(1) patch would not suffice as an interface for
>> printk enumeration.
>
>I think you are on a path to try to make printk output immutable.
>I think that's a _very_ bad path.
That's literally the opposite of what this patchset does. This patchset
offloads the responsibility of worrying about userspace parsers breaking
because of changes to kernel printks, because those userspace parsers and
maintainers now have a mechanism to detect changes. If anything, it _reduces_
the risk of what you're describing.
Chris Down writes:
>>>3. `KERN_SOH + level' can appear in other places than just printk strings
>>>
>>>KERN_SOH is just ASCII '\001' -- it's not distinctive or unique, even when
>>>paired with a check for something that looks like a level after it. For this
>>>reason, your proposed patch results in a non-trivial amount of non-printk
>>>related garbage in its output. For example:
>>>
>>>?????% binutils/strings -k /tmp/vmlinux | head -5
>>>?????3L)s
>>>?????3L)s
>>>?????c,[]A\
>>>?????c(L)c
>>>?????d$pL)d$`u4
>>>
>>>Fundamentally, one cannot use a tool which just determines whether something is
>>>printable to determine semantic intent.
>>
>>$ kernel_strings --kernel --section ".rodata" vmlinux
>>
>>I got exactly 0.
>
>"It works on my computer" is not a valid testing methodology,
>especially for something as complex as the Linux kernel. It's
>especially not a valid rebuttal to someone demonstrating that it
>clearly doesn't work on theirs.
>
>Even filtering to the .rodata section, there's plenty of garbage just
>in the first five cases:
>
> % binutils/strings --kernel --section ".rodata" /tmp/vmlinux | head -5
> 3******* Your BIOS seems to not contain a fix for K8 errata #93
> 1>pBC)
> dTRAC
> 6Run %s as init process
> 7calling %pS @ %i
>
>Clearly there are cases that you are not considering. My kernel config
>is attached if you want to try and replicate, but regardless, it's
>really not valid to say "it works for me" in response to someone
>showing that it doesn't.
Attached.
Hi Chris,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on jeyu/modules-next]
[also build test ERROR on linux/master soc/for-next openrisc/for-next powerpc/next asm-generic/master linus/master v5.11-rc6 next-20210125]
[cannot apply to tip/x86/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Chris-Down/printk-Userspace-format-enumeration-support/20210204-234240
base: https://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux.git modules-next
config: um-x86_64_defconfig (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/08a6039db35c32e39de288d58d0362013fdfb387
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Chris-Down/printk-Userspace-format-enumeration-support/20210204-234240
git checkout 08a6039db35c32e39de288d58d0362013fdfb387
# save the attached .config to linux build tree
make W=1 ARCH=um SUBARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>
All errors (new ones prefixed by >>):
/usr/bin/ld: arch/um/drivers/xterm.o: in function `xterm_open':
>> xterm.c:(.text+0x16b): undefined reference to `printk'
>> /usr/bin/ld: xterm.c:(.text+0x1a8): undefined reference to `printk'
/usr/bin/ld: xterm.c:(.text+0x1f4): undefined reference to `printk'
/usr/bin/ld: xterm.c:(.text+0x25e): undefined reference to `printk'
/usr/bin/ld: xterm.c:(.text+0x2b5): undefined reference to `printk'
/usr/bin/ld: arch/um/drivers/xterm.o:xterm.c:(.text+0x31c): more undefined references to `printk' follow
collect2: error: ld returned 1 exit status
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]
kernel test robot writes:
>All errors (new ones prefixed by >>):
>
> /usr/bin/ld: arch/um/drivers/xterm.o: in function `xterm_open':
>>> xterm.c:(.text+0x16b): undefined reference to `printk'
>>> /usr/bin/ld: xterm.c:(.text+0x1a8): undefined reference to `printk'
> /usr/bin/ld: xterm.c:(.text+0x1f4): undefined reference to `printk'
> /usr/bin/ld: xterm.c:(.text+0x25e): undefined reference to `printk'
> /usr/bin/ld: xterm.c:(.text+0x2b5): undefined reference to `printk'
> /usr/bin/ld: arch/um/drivers/xterm.o:xterm.c:(.text+0x31c): more undefined references to `printk' follow
> collect2: error: ld returned 1 exit status
Ah yes, since it just uses it as extern. I'll fix it for v2.