2012-06-27 06:09:45

by Ingo Molnar

[permalink] [raw]
Subject: Re: cosmetic printk() issue with lockdep warning in arch/x86/kernel/alternative.c


(Cc:-ed a more interested folks and lkml, as this really matters
to more people than just me)

* Kay Sievers <[email protected]> wrote:

> mind moving "fixing up alternatives" in:
> arch/x86/kernel/alternative.c:431
> down to where really happens something, or whatever other solution
> fits, maybe it can be removed today ...
>
> The CPU code in:
> arch/x86/kernel/smpboot.c:654
> tries to print continuation lines, which look weird with the
> prefix-less lockdep warning that gets merged into the former line when
> lockdep is enabled:
>
> [ 0.053070] lockdep: fixing up alternatives.
> [ 0.054003] Booting Node 0, Processors #1lockdep: fixing up alternatives.
> [ 0.128995] #2lockdep: fixing up alternatives.
> [ 0.203985] #3lockdep: fixing up alternatives.
> [ 0.278972] #4lockdep: fixing up alternatives.
> [ 0.353960] #5lockdep: fixing up alternatives.
> [ 0.427948] #6lockdep: fixing up alternatives.
> [ 0.501937] #7 Ok.
> [ 0.575952] Brought up 8 CPUs

We could certainly change the lockdep printouts - but I'm really
not sure it's wise to fight those kinds of printk() usage
patterns:

Piecemail wise printouts are handy and natural as they follow
actual code flow. As the code progresses so do the continued
printk()s get progressed. We use single-line printouts to not
use up too much screen real estate during bootup.

So ... why not do what I suggested very, very early on: don't
touch printk screen output itself *AT ALL* - keep it simple and
direct, don't add complexity to the pathway of kernel code
trying to communicate a kernel bug to the user.

Instead solve your needs by simply tracing printk() calls and
the resulting strings [with a string event - I think Steve
posted an RFC patch for that early on - apply that patch and
recover the events - we could build-in this portion of tracing
whenever CONFIG_PRINTK is enabled to make sure the facility is
always available] and reconstruct the lines and the full
ASCII-string events in user-space: if PID,CPU is traced and
there's one record per printk() call then it's *trivial* to
reconstruct 99.9% of the printk lines in the user-space code
that receives the trace events.

Keep /proc/kmsg and don't change it much - but trace printk()
calls form your new systemd code. WHY don't you use that
approach? It's actually totally sensible, useful, it keeps the
printk status quo in place, improves other usecases and can be
done using existing facilities...

Thanks,

Ingo


2012-06-27 10:49:27

by Kay Sievers

[permalink] [raw]
Subject: Re: cosmetic printk() issue with lockdep warning in arch/x86/kernel/alternative.c

On Wed, Jun 27, 2012 at 8:09 AM, Ingo Molnar <[email protected]> wrote:
>> [    0.353960]  #5lockdep: fixing up alternatives.
>> [    0.427948]  #6lockdep: fixing up alternatives.
>> [    0.501937]  #7 Ok.
>> [    0.575952] Brought up 8 CPUs
>
> We could certainly change the lockdep printouts - but I'm really
> not sure it's wise to fight those kinds of printk() usage
> patterns:
>
> Piecemail wise printouts are handy and natural as they follow
> actual code flow. As the code progresses so do the continued
> printk()s get progressed. We use single-line printouts to not
> use up too much screen real estate during bootup.
>
> So ... why not do what I suggested very, very early on: don't
> touch printk screen output itself *AT ALL* - keep it simple and
> direct, don't add complexity to the pathway of kernel code
> trying to communicate a kernel bug to the user.

Not sure if I follow, this is exactly the case of: "We use single-line
printouts to not use up too much screen real estate" you mention. What
happens here is that the CPU code does not want hundreds of lines on
big boxes, hence it does try to fold them into one line with CONT
print. The lockdep print just prints a full terminated line
*in-between* every CONT of the stream.

As said, it's cosmetic, but it's obviously ugly and wrong to do that.
That's all. I'm not interested in the text it prints, it will not hurt
anything if we don't change it, it just looks ugly.

> Instead solve your needs by simply tracing printk() calls and
> the resulting strings [with a string event - I think Steve
> posted an RFC patch for that early on - apply that patch and
> recover the events - we could build-in this portion of tracing
> whenever CONFIG_PRINTK is enabled to make sure the facility is
> always available] and reconstruct the lines and the full
> ASCII-string events in user-space: if PID,CPU is traced and
> there's one record per printk() call then it's *trivial* to
> reconstruct 99.9% of the printk lines in the user-space code
> that receives the trace events.

As mentioned in the earlier discussion weeks ago, I'm not convinced
that tracing is the replacement for general purpose system logging.
Tracing to me is more an optional analysis and debugging tool to me,
not do much something we make mandatory now, that must work for every
little box out there, without any further configuration.

We just want a more reliable kernel logging, and for some logs the
context it was created in, very much like dev_printk() was supposed to
do, just in a working manner.

What we do already for userspace service logging, we want to integrate
the kernel logs for the related devices too:
https://plus.google.com/108087225644395745666/posts/iuFEgBZT3co

This is mainly about being able to *filter* the log, relate it to a
context, and query it at a later from userspace tools. The main
consumer is still the admin or human in front of it, but without the
grep and regex.

> Keep /proc/kmsg and don't change it much - but trace printk()
> calls form your new systemd code. WHY don't you use that
> approach? It's actually totally sensible, useful, it keeps the
> printk status quo in place, improves other usecases and can be
> done using existing facilities...

Why would we want another buffer for the same data? I really don't see
the benefit that much, but I might miss it.

Kmsg to us, is the counterpart of syslog, and that seems just good
enough, when messages do not interleave anymore, have sequence
numbers, and some simple device/context annotation. The kernel needs
to maintain and buffer them anyway and all that comes for almost free.

All we want is a reliable printk, and that is not only nice when we
retrieve the logs, it's also nice for internal things like ksmg_dump()
or any other user that was fiddling with possible dangling pointers in
the past in the old kmsg byte buffer.

I personally could understand if we wanted to replace the entire kmsg
buffering with the tracing buffer code, and make the trace buffer
mandatory that way. That would, in theory at least, unify things. But
I really don't see at the moment how we could require another entirely
independent facility to just make "plain old syslog" work better.

Thanks,
Kay

2012-06-27 11:06:17

by Peter Zijlstra

[permalink] [raw]
Subject: Re: cosmetic printk() issue with lockdep warning in arch/x86/kernel/alternative.c

On Wed, 2012-06-27 at 12:49 +0200, Kay Sievers wrote:
> All we want is a reliable printk

We too, but we define reliable differently. I want printk() to dump data
ASAP so that it has better chance to get out of the system and onto my
console in case of a crash. I don't give a rats arse about interleaved
text, any text is better than no text.

So if you go add separate buffers in there that will delay writing text,
you're actively destroying printk().

Any change that makes it harder to get text out is bad.

2012-06-27 11:12:49

by Kay Sievers

[permalink] [raw]
Subject: Re: cosmetic printk() issue with lockdep warning in arch/x86/kernel/alternative.c

On Wed, Jun 27, 2012 at 1:06 PM, Peter Zijlstra <[email protected]> wrote:
> On Wed, 2012-06-27 at 12:49 +0200, Kay Sievers wrote:
>> All we want is a reliable printk
>
> We too, but we define reliable differently. I want printk() to dump data
> ASAP so that it has better chance to get out of the system and onto my
> console in case of a crash. I don't give a rats arse about interleaved
> text, any text is better than no text.

Sure, I fully understand that this is useful.

> So if you go add separate buffers in there that will delay writing text,
> you're actively destroying printk().
>
> Any change that makes it harder to get text out is bad.

The text is always flushed immediately for full lines, there is no
change at all. Since yesterday, continuation lines (which are
buffered) are flushed immediately to the console too.

I don't think there will be a substantial difference anymore to the
old behaviour, but we still get reliable records stored in the kmsg
buffer.

Kay

2012-06-27 14:44:54

by Joe Perches

[permalink] [raw]
Subject: Re: cosmetic printk() issue with lockdep warning in arch/x86/kernel/alternative.c

On Wed, 2012-06-27 at 13:12 +0200, Kay Sievers wrote:
> Since yesterday, continuation lines (which are
> buffered) are flushed immediately to the console too.

Has Greg KH's boot hang been resolved?

2012-06-29 12:58:51

by Ingo Molnar

[permalink] [raw]
Subject: Re: cosmetic printk() issue with lockdep warning in arch/x86/kernel/alternative.c


* Peter Zijlstra <[email protected]> wrote:

> On Wed, 2012-06-27 at 12:49 +0200, Kay Sievers wrote:
> > All we want is a reliable printk
>
> We too, but we define reliable differently. I want printk() to
> dump data ASAP so that it has better chance to get out of the
> system and onto my console in case of a crash. I don't give a
> rats arse about interleaved text, any text is better than no
> text.

I actually *fixed* bugs in the past when I saw interleaved text
printed from multiple CPUs, it told me how narrow and SMP
dependent a particular race/crash was.

So even interleaving, although annoying most of the time, can be
useful in some other places.

Thanks,

Ingo