LinuxLists.cc - [PATCH v2] printk: Have printk() never buffer its data

2012-06-25 13:21:49

Subject: [PATCH v2] printk: Have printk() never buffer its data

Fengguang Wu had a config that caused the system to lockup. It reported:

[ 6.086395] type=2000 audit(1339501575.085:1): initialized
[ 6.116356] Kprobe smoke test started
[ 6.125915] Kprobe smoke test passed successfully
[ 6.127478] rcu-torture:--- Start of test: nreaders=2 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0
+fqs_holdoff=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0

and then froze. So naturally, the suspected bug was with rcu-torture.
Fengguang did a git bisect and discovered that the crash came with a
function trace update. He also noticed that if he reverted that update,
the system got farther and showed:

[ 1.611901] Testing tracer function: PASSED

His time was wasted by the fact that the function tracing self test
first prints:

"Testing tracer function: "

then runs the test, and if it succeeds, it prints "PASSED", giving us
the nice output you see above.

But with the new printk() changes, text without a newline gets buffered
and does not print out to the console at the location of the printk.
This caused the "Testing tracer function: " not to print out and because
the test caused the kernel to lock up, we are clueless to the fact that
the problem was with the function tracer test and not the rcu_torture
test. As it made sense that the rcu_torture test could lock up the
system, many kernel developer hours were wasted chasing the wrong wild
goose.

If the "Testing tracer function: " had printed out in the first place,
we would have caught the correct wild goose and saved precious kernel
developer's time.

This brings back the old way of printk() that always prints to the console
and does not buffer the data. As printk_emit() is used by other 'facilities'
this only affects printk(), but keeps pretty much the new behavior.

The printk() state of the last print is kept around to know how to print
the new lines. If the printk does not have a new line, the state is
stored in msg_print_text() and if a prefix is to be printed next, it
forces a newline. This keeps the new behavior when multiple partial prints are
happening across CPUs. Each one will have its own line (like it does today).

I hooked onto the 'facility' infrastructure and used '0x1ffc' (the max
number) as a way to communicate printk() commands to the msg_print_text()
which is performed at a later time (console_unlock()).

I tested this out, and now the lockup shows:

[ 9.018231] Kprobe smoke test passed successfully
[ 9.023084] rcu-torture:--- Start of test: nreaders=4 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_hold
off=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
[ 9.066065] Testing tracer function:

To prevent the flush to cause the next printk to have a timestamp that
would produce the following:

[ 6.834073] Testing tracer function: [ 7.136194] PASSED

I made the KERN_CONT ("<c>") use the facility as well to pass info to
not print the timestamp. This fixes the abve issue to print:

[ 1.291025] Testing tracer function: PASSED

When multiple SMP calls take place, this happens:

[ 1.310024] Testing tracer function: PASSED
[ 1.648504] Testing dynamic ftrace: PASSED
[ 2.099752] Testing dynamic ftrace ops #1: (1 0 1 1 0) (1 1 2 1 0)
[ 2.247911] Refined TSC clocksource calibration: 2659.999 MHz.
[ 2.254229] Switching to clocksource tsc
(2 1 3 1 22) (2 2 4 1 4044) PASSED
[ 2.572013] Testing dynamic ftrace ops #2: (1 0 1 11 0) (1 1 2 4623 0) (2 1 3 10096 170) (2 2 4 16097 6162) PASSED
[ 3.106015] Testing tracer irqsoff: PASSED

Link: http://lkml.kernel.org/r/[email protected]

Cc: Linus Torvalds <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "kay.sievers" <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Fengguang Wu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Joe Perches <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
---
kernel/printk.c | 99 ++++++++++++++++++++++++++++++++++++++++++++++++------
1 files changed, 88 insertions(+), 11 deletions(-)

diff --git a/kernel/printk.c b/kernel/printk.c
index 32462d2..15267cd 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -237,6 +237,24 @@ static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
static char *log_buf = __log_buf;
static u32 log_buf_len = __LOG_BUF_LEN;

+/*
+ * The max number for facility is 0xffff >> 3, or 0x1fff.
+ * As log facilities count upward from 1, we have the kernel
+ * keep the top numbers or itself.
+ *
+ * 0x1ffc is the code for kernel processing in msg->level (facility)
+ * bit 0 - do not print prefix
+ * bit 1 - do not print newline
+ */
+#define LOG_KERNEL 0x1ffc
+#define LOG_NONL_NOPREFIX_MSK 3
+#define LOG_NONL_SET 2
+#define LOG_NOPREFIX_SET 1
+
+#define LOG_NONL_NOPREFIX (LOG_KERNEL | LOG_NONL_NOPREFIX_MSK)
+#define LOG_NONL (LOG_KERNEL | LOG_NONL_SET)
+#define LOG_NOPREFIX (LOG_KERNEL | LOG_NOPREFIX_SET)
+
/* cpu currently holding logbuf_lock */
static volatile unsigned int logbuf_cpu = UINT_MAX;

@@ -836,14 +854,45 @@ static size_t msg_print_text(const struct log *msg, bool syslog,
}

if (buf) {
- if (print_prefix(msg, syslog, NULL) +
- text_len + 1>= size - len)
- break;
+ static bool last_newline = true;
+ bool newline = true;
+ bool prefix = true;
+ int facility = msg->level >> 3;
+
+ /*
+ * The kernel sends some commands via the facility.
+ * To do so, a high number mask is used (LOG_KERNEL)
+ * and the low bits of the mask hold the command bits
+ * that the kernel printk() will use to state how the
+ * msg will be printed.
+ */
+ if ((facility & LOG_KERNEL) == LOG_KERNEL) {
+ if (facility & LOG_NOPREFIX_SET)
+ prefix = false;
+ if (facility & LOG_NONL_SET)
+ newline = false;
+ }
+
+ if (prefix) {
+ /*
+ * Force newline, for last line if this line
+ * is printing out a prefix.
+ */
+ if (!last_newline)
+ buf[len++] = '\n';
+
+ if (print_prefix(msg, syslog, NULL) +
+ text_len + 1 >= size - len)
+ break;
+
+ len += print_prefix(msg, syslog, buf + len);
+ }

- len += print_prefix(msg, syslog, buf + len);
memcpy(buf + len, text, text_len);
len += text_len;
- buf[len++] = '\n';
+ if (newline)
+ buf[len++] = '\n';
+ last_newline = newline;
} else {
/* SYSLOG_ACTION_* buffer size only calculation */
len += print_prefix(msg, syslog, NULL);
@@ -1267,6 +1316,7 @@ asmlinkage int vprintk_emit(int facility, int level,
static char cont_buf[LOG_LINE_MAX];
static size_t cont_len;
static int cont_level;
+ static bool cont_prefix;
static struct task_struct *cont_task;
static char textbuf[LOG_LINE_MAX];
char *text = textbuf;
@@ -1275,8 +1325,12 @@ asmlinkage int vprintk_emit(int facility, int level,
int this_cpu;
bool newline = false;
bool prefix = false;
+ bool flush;
int printed_len = 0;

+ /* output from printk() always flush to console (no line buffering) */
+ flush = facility == 0;
+
boot_delay_msec();
printk_delay();

@@ -1350,8 +1404,13 @@ asmlinkage int vprintk_emit(int facility, int level,
newline = true;
}

+ /* Force prefix for new task */
+ if (cont_task != current)
+ prefix = true;
+
if (!newline) {
- if (cont_len && (prefix || cont_task != current)) {
+
+ if (cont_len && prefix) {
/*
* Flush earlier buffer, which is either from a
* different thread, or when we got a new prefix.
@@ -1363,13 +1422,29 @@ asmlinkage int vprintk_emit(int facility, int level,
if (!cont_len) {
cont_level = level;
cont_task = current;
+ cont_prefix = prefix;
}

- /* buffer or append to earlier buffer from the same thread */
- if (cont_len + text_len > sizeof(cont_buf))
- text_len = sizeof(cont_buf) - cont_len;
- memcpy(cont_buf + cont_len, text, text_len);
- cont_len += text_len;
+ /* For printk() callers, always output to the console */
+ if (flush) {
+ int code = LOG_NONL;
+
+ /*
+ * If the buffered string was KERN_CONT,
+ * do not print prefix.
+ */
+ if (!prefix)
+ code = LOG_NONL_NOPREFIX;
+
+ log_store(code, cont_level, NULL, 0, text, text_len);
+ } else {
+
+ /* buffer or append to earlier buffer from the same thread */
+ if (cont_len + text_len > sizeof(cont_buf))
+ text_len = sizeof(cont_buf) - cont_len;
+ memcpy(cont_buf + cont_len, text, text_len);
+ cont_len += text_len;
+ }
} else {
if (cont_len && cont_task == current) {
if (prefix) {
@@ -1395,6 +1470,8 @@ asmlinkage int vprintk_emit(int facility, int level,
printed_len = cont_len;
} else {
/* ordinary single and terminated line */
+ if (!prefix && !facility && cont_task == current)
+ facility = LOG_NOPREFIX;
log_store(facility, level,
dict, dictlen, text, text_len);
printed_len = text_len;
--
1.7.3.4

2012-06-25 13:33:58

by Steven Rostedt

[permalink] [raw]

Subject: Re: [PATCH v2] printk: Have printk() never buffer its data

On Mon, 2012-06-25 at 09:21 -0400, Steven Rostedt wrote:

> This brings back the old way of printk() that always prints to the console
> and does not buffer the data. As printk_emit() is used by other 'facilities'
> this only affects printk(), but keeps pretty much the new behavior.

Note, this patch brings back the old printk() behavior of not buffering
the output to the console, without the need of printk_flush() or any
helper function, and it does so without a revert of the changes that Kay
has made.

Even with Kay's changes, if multiple partial lines are printed on SMP,
those would be interleaved as well, as the new method can only save one
partial line at a time. Only full lines will not flush out the buffer.

At least with my patch, if two interleaving printks were coming out at
the same time, they would each get their own line (better than the old
way).

We could probably even add a task id output if necessary, and an
external tool could easily put them back together. Only print the task
id if a partial line was preempted by another task. But that would come
after a later debate.

I'm thinking that this patch is the best of both worlds. Most of Kay's
work is still intact, and printk() still behaves as we expect it to.

-- Steve

2012-06-25 13:56:18

by Ingo Molnar

[permalink] [raw]

Subject: Re: [PATCH v2] printk: Have printk() never buffer its data

* Steven Rostedt <[email protected]> wrote:

> [...]
>
> The printk() state of the last print is kept around to know
> how to print the new lines. If the printk does not have a new
> line, the state is stored in msg_print_text() and if a prefix
> is to be printed next, it forces a newline. This keeps the new
> behavior when multiple partial prints are happening across
> CPUs. Each one will have its own line (like it does today).
>
> I hooked onto the 'facility' infrastructure and used '0x1ffc'
> (the max number) as a way to communicate printk() commands to
> the msg_print_text() which is performed at a later time
> (console_unlock()).
>
> I tested this out, and now the lockup shows:
>
> [ 9.018231] Kprobe smoke test passed successfully
> [ 9.023084] rcu-torture:--- Start of test: nreaders=4 nfakewriters=4 stat_interval=0 verbose=0 test_no_idle_hz=0 shuffle_interval=3 stutter=5 irqreader=1 fqs_duration=0 fqs_hold
> off=0 fqs_stutter=3 test_boost=1/0 test_boost_interval=7 test_boost_duration=4 shutdown_secs=0 onoff_interval=0 onoff_holdoff=0
> [ 9.066065] Testing tracer function:

Ok, this is *much* better behavior, out of box.

A couple of comments:

> @@ -836,14 +854,45 @@ static size_t msg_print_text(const struct log *msg, bool syslog,
> }
>
> if (buf) {
> - if (print_prefix(msg, syslog, NULL) +
> - text_len + 1>= size - len)
> - break;
> + static bool last_newline = true;

I'd suggest to move this last_newline flag up to the logbuf_lock
block of global variables - it belongs there. Statics are easily
overlooked and maybe something else running under the
logbuf_lock will want to access this variable in the future.

> + bool newline = true;
> + bool prefix = true;
> + int facility = msg->level >> 3;
> +
> + /*
> + * The kernel sends some commands via the facility.
> + * To do so, a high number mask is used (LOG_KERNEL)
> + * and the low bits of the mask hold the command bits
> + * that the kernel printk() will use to state how the
> + * msg will be printed.
> + */
> + if ((facility & LOG_KERNEL) == LOG_KERNEL) {
> + if (facility & LOG_NOPREFIX_SET)
> + prefix = false;
> + if (facility & LOG_NONL_SET)
> + newline = false;
> + }

I suspect using a separate command flag and passing it along
would be somewhat cleaner - but no strong objections against
this approach either.

> + if (prefix) {
> + /*
> + * Force newline, for last line if this line
> + * is printing out a prefix.
> + */
> + if (!last_newline)
> + buf[len++] = '\n';
> +
> + if (print_prefix(msg, syslog, NULL) +
> + text_len + 1 >= size - len)
> + break;
> +
> + len += print_prefix(msg, syslog, buf + len);
> + }

Just a side note, this is a problem that exists in the new
devkmsg_user code, message size limit handling of
devkmsg_user->buf[] is non-existent and conditions for and
protections against triggering overflow are unclear - right now
it's probably addressed by making the buffer excessively large:

struct devkmsg_user {
u64 seq;
u32 idx;
struct mutex lock;
char buf[8192];
};

but this may eventually have to be addressed - various things
like newline insertion or automatic escaping can enlargen the
buffer - if an attacker ever has control over a large enough
printk'ed text then this is a potential root hole.

>
> - len += print_prefix(msg, syslog, buf + len);
> memcpy(buf + len, text, text_len);
> len += text_len;
> - buf[len++] = '\n';
> + if (newline)
> + buf[len++] = '\n';
> + last_newline = newline;
> } else {
> /* SYSLOG_ACTION_* buffer size only calculation */
> len += print_prefix(msg, syslog, NULL);
> @@ -1267,6 +1316,7 @@ asmlinkage int vprintk_emit(int facility, int level,
> static char cont_buf[LOG_LINE_MAX];
> static size_t cont_len;
> static int cont_level;
> + static bool cont_prefix;
> static struct task_struct *cont_task;
> static char textbuf[LOG_LINE_MAX];

argh. So the vprintk_emit() muck introduced its own large set of
function local statics? Taste fail, really ...

> char *text = textbuf;
> @@ -1275,8 +1325,12 @@ asmlinkage int vprintk_emit(int facility, int level,
> int this_cpu;
> bool newline = false;
> bool prefix = false;
> + bool flush;
> int printed_len = 0;
>
> + /* output from printk() always flush to console (no line buffering) */
> + flush = facility == 0;

While your code is correct, this pattern is easily mistaken for
the 'a = b = c' pattern, so I'd suggest writing it as:

flush = (facility == 0);

Anyway, bike shed painting aside, the patch looks like a
workable solution to me.

Thanks,

Ingo

2012-06-25 14:26:12

by Steven Rostedt

[permalink] [raw]

Subject: Re: [PATCH v2] printk: Have printk() never buffer its data

On Mon, 2012-06-25 at 15:56 +0200, Ingo Molnar wrote:

> > @@ -836,14 +854,45 @@ static size_t msg_print_text(const struct log *msg, bool syslog,
> > }
> >
> > if (buf) {
> > - if (print_prefix(msg, syslog, NULL) +
> > - text_len + 1>= size - len)
> > - break;
> > + static bool last_newline = true;
>
> I'd suggest to move this last_newline flag up to the logbuf_lock
> block of global variables - it belongs there. Statics are easily
> overlooked and maybe something else running under the
> logbuf_lock will want to access this variable in the future.

Will do.

>
> > + bool newline = true;
> > + bool prefix = true;
> > + int facility = msg->level >> 3;
> > +
> > + /*
> > + * The kernel sends some commands via the facility.
> > + * To do so, a high number mask is used (LOG_KERNEL)
> > + * and the low bits of the mask hold the command bits
> > + * that the kernel printk() will use to state how the
> > + * msg will be printed.
> > + */
> > + if ((facility & LOG_KERNEL) == LOG_KERNEL) {
> > + if (facility & LOG_NOPREFIX_SET)
> > + prefix = false;
> > + if (facility & LOG_NONL_SET)
> > + newline = false;
> > + }
>
> I suspect using a separate command flag and passing it along
> would be somewhat cleaner - but no strong objections against
> this approach either.

Yeah, I hate passing info via the facility, but I didn't want to bloat
the struct log any more. As the msg_print_text() is run separately from
printk, via the console_unlock(), the struct log is the only way to send
information from printk_emit() to msg_print_text().

Right now we have:

struct log {
u64 ts_nsec; /* timestamp in nanoseconds */
u16 len; /* length of entire record */
u16 text_len; /* length of text buffer */
u16 dict_len; /* length of dictionary buffer */
u16 level; /* syslog level + facility */
};

This is actually saved on the printk buffer (just not printed). Maybe a
better idea would be to break up the level more formally:

instead of:

msg->level = (facility << 3) | (level & 7);

we could do:

msg->level = (flags << 12) | ((facility & 0x1ff) << 3) | (level & 7);

and pass in the flags separately.

Yeah, I think this may be a cleaner solution.

>
> > + if (prefix) {
> > + /*
> > + * Force newline, for last line if this line
> > + * is printing out a prefix.
> > + */
> > + if (!last_newline)
> > + buf[len++] = '\n';
> > +
> > + if (print_prefix(msg, syslog, NULL) +
> > + text_len + 1 >= size - len)
> > + break;
> > +
> > + len += print_prefix(msg, syslog, buf + len);
> > + }
>
> Just a side note, this is a problem that exists in the new
> devkmsg_user code, message size limit handling of
> devkmsg_user->buf[] is non-existent and conditions for and
> protections against triggering overflow are unclear - right now
> it's probably addressed by making the buffer excessively large:
>
> struct devkmsg_user {
> u64 seq;
> u32 idx;
> struct mutex lock;
> char buf[8192];
> };
>
> but this may eventually have to be addressed - various things
> like newline insertion or automatic escaping can enlargen the
> buffer - if an attacker ever has control over a large enough
> printk'ed text then this is a potential root hole.

Agreed, this part made me a little nervous about buffer overflow too.

>
> >
> > - len += print_prefix(msg, syslog, buf + len);
> > memcpy(buf + len, text, text_len);
> > len += text_len;
> > - buf[len++] = '\n';
> > + if (newline)
> > + buf[len++] = '\n';
> > + last_newline = newline;
> > } else {
> > /* SYSLOG_ACTION_* buffer size only calculation */
> > len += print_prefix(msg, syslog, NULL);
> > @@ -1267,6 +1316,7 @@ asmlinkage int vprintk_emit(int facility, int level,
> > static char cont_buf[LOG_LINE_MAX];
> > static size_t cont_len;
> > static int cont_level;
> > + static bool cont_prefix;
> > static struct task_struct *cont_task;
> > static char textbuf[LOG_LINE_MAX];
>
>
> argh. So the vprintk_emit() muck introduced its own large set of
> function local statics? Taste fail, really ...
>
> > char *text = textbuf;
> > @@ -1275,8 +1325,12 @@ asmlinkage int vprintk_emit(int facility, int level,
> > int this_cpu;
> > bool newline = false;
> > bool prefix = false;
> > + bool flush;
> > int printed_len = 0;
> >
> > + /* output from printk() always flush to console (no line buffering) */
> > + flush = facility == 0;
>
> While your code is correct, this pattern is easily mistaken for
> the 'a = b = c' pattern, so I'd suggest writing it as:
>
> flush = (facility == 0);

Yeah, that looks better. Will fix.

>
> Anyway, bike shed painting aside, the patch looks like a
> workable solution to me.

Great! Lets hope Kay feels the same way.

-- Steve

2012-06-25 15:22:47

by Kay Sievers

[permalink] [raw]

Subject: Re: [PATCH v2] printk: Have printk() never buffer its data

On Mon, Jun 25, 2012 at 4:26 PM, Steven Rostedt <[email protected]> wrote:
> On Mon, 2012-06-25 at 15:56 +0200, Ingo Molnar wrote:

>> Anyway, bike shed painting aside, the patch looks like a
>> workable solution to me.
>
> Great! Lets hope Kay feels the same way.

Hmm, what I still miss, is how the log record export should work
properly with unbuffered continuation lines.

The buffering of continuation line users resulted in a clean one
record per logged line. Now we will get all separated records for all
continuation prints of a single line, which would either need to be
merged in userspace, or internally.

Non-buffered continuation lines and and a record buffer don't play
that well with each other. It also sounds like quite a lot of wasted
headers in the buffer, which we need to carry around and throw away
when we reconstruct the line for output again. If we merge them
internally we mess around with the sequence numbers, if we merge them
in userspace we would need to export the flags to do that.

Kay

2012-06-25 15:55:45

by Steven Rostedt

[permalink] [raw]

Subject: Re: [PATCH v2] printk: Have printk() never buffer its data

On Mon, 2012-06-25 at 17:22 +0200, Kay Sievers wrote:
> On Mon, Jun 25, 2012 at 4:26 PM, Steven Rostedt <[email protected]> wrote:
> > On Mon, 2012-06-25 at 15:56 +0200, Ingo Molnar wrote:
>
> >> Anyway, bike shed painting aside, the patch looks like a
> >> workable solution to me.
> >
> > Great! Lets hope Kay feels the same way.
>
> Hmm, what I still miss, is how the log record export should work
> properly with unbuffered continuation lines.
>
> The buffering of continuation line users resulted in a clean one
> record per logged line. Now we will get all separated records for all
> continuation prints of a single line, which would either need to be
> merged in userspace, or internally.

Merge it in userspace.

>
> Non-buffered continuation lines and and a record buffer don't play
> that well with each other.

Yeah, I know, as they seem to be a problem together in the coding.

> It also sounds like quite a lot of wasted
> headers in the buffer, which we need to carry around and throw away
> when we reconstruct the line for output again. If we merge them
> internally we mess around with the sequence numbers, if we merge them
> in userspace we would need to export the flags to do that.

As I'm still on debian and fedora 14, I have to admit that I do not
quite understand exactly what you are trying to do with systemd or your
logging facility. But it seems to become clear that printk isn't a tool
for it.

What exactly are you trying to record from the kernel? Device
information, kernel diagnostics, or something else? Maybe there should
be another facility besides printk that can pass what you want to your
utilities.

printk is the first line of attack for kernel developers to find their
bugs. If it becomes bloated and focused on users, it will make
development of the kernel much more difficult.

Perhaps we should have added a hook or a tracepoint at the beginning of
printk that your logging facility could attach to, and then it could do
logging and buffering in any manner it would like, instead of hijacking
the best debugging tool that kernel developers use.

I'm starting to side with Ingo that the new printk should be reverted
and rethought, and find another answer to the fixes that it solved.

-- Steve

2012-06-26 20:44:06

by Ingo Molnar

[permalink] [raw]

Subject: Re: [PATCH v2] printk: Have printk() never buffer its data

* Steven Rostedt <[email protected]> wrote:

> > It also sounds like quite a lot of wasted headers in the
> > buffer, which we need to carry around and throw away when we
> > reconstruct the line for output again. If we merge them
> > internally we mess around with the sequence numbers, if we
> > merge them in userspace we would need to export the flags to
> > do that.
>
> As I'm still on debian and fedora 14, I have to admit that I
> do not quite understand exactly what you are trying to do with
> systemd or your logging facility. But it seems to become clear
> that printk isn't a tool for it.
>
> What exactly are you trying to record from the kernel? Device
> information, kernel diagnostics, or something else? Maybe
> there should be another facility besides printk that can pass
> what you want to your utilities.
>
> printk is the first line of attack for kernel developers to
> find their bugs. If it becomes bloated and focused on users,
> it will make development of the kernel much more difficult.
>
> Perhaps we should have added a hook or a tracepoint at the
> beginning of printk that your logging facility could attach
> to, [...]

That was my immediate suggestion very early on in the printk
discussion, months ago, when Kay was still giving us nightmares
about UUID printk's and other nonsense.

Adding the printk tracepoint and feeding it to kmesg would have
been a few trivial lines of code plus tracing code reuse, it
would have made a ton of sense and it would have enhanced
tracing and profiling as well, not just kmdesg/syslogd-ng.

I guess the triviality of it must have been one of the reasons
why it wasn't implemented that way ;-)

Today I'd be happy with at least having clean separation of a
direct, relatively simple printk() from whatever special purpose
new logging facilities feed off of printk().

Thanks,

Ingo