Date: Sat, 11 Feb 2012 15:38:09 +0100
From: Ingo Molnar <mingo@elte.hu>
To: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Peter Zijlstra <a.p.zijlstra@chello.nl>, paulus@samba.org,
        cjashfor@linux.vnet.ibm.com, fweisbec@gmail.com,
        linux-kernel@vger.kernel.org,
        "James E.J. Bottomley" <jejb@parisc-linux.org>,
        Jan Blunck <jblunck@suse.de>
Subject: Re: [RFC 0/5] kernel: backtrace unwind support
Message-ID: <20120211143809.GA19713@elte.hu>
References: <1328873119-21553-1-git-send-email-jolsa@redhat.com>
 <1328895795.25989.29.camel@laptop>
 <CA+55aFxgPXjGh0GSHaUGm6-Pfdjjk=PAP7HMuZHcFGE92VutUQ@mail.gmail.com>
 <20120210192714.GE4998@infradead.org>
 <20120210194426.GA17650@elte.hu>
 <20120210201850.GA26892@m.redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20120210201850.GA26892@m.redhat.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2242
Lines: 55


* Jiri Olsa <jolsa@redhat.com> wrote:

> > I had a quick peek and I don't think it's constructed in a 
> > resilient enough form right now. For example there's no clear 
> > separation and checking of what comes from GCC and what not.
> 
> yes, there's nothing like this in now, I'll see what can be 
> done about that..

Another resilience feature of lockdep is the 'one strike and you 
are out!' aspect: the first error or unexpected condition we 
detect results in the very quick shutting down of all things 
lockdep. It prints exactly one error message, then it 
deactivates and never ever runs again.

The equivalent of this in the scope of your dwarf unwind kernel 
feature would be to fall back to the regular guess and 
framepointer based stack backtrace method the moment any error 
is detected.

Maybe print a single line that indicates that the fallback has 
been activated, and after that the dwarf code should never run 
again. Make sure nobody comes away with an 'oh, no, the dwarf unwind 
messed up things!' impression, even if it *does* run into some 
trouble (such as unexpected debuginfo generated by GCC - or 
debuginfo *corrupted* by a kernel bug [a very real 
possibility]).

What is totally unacceptable is for the dwarf code to *cause* 
crashes, or to destroy stack trace information.
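
To make the 'one strike' fallback concrete, here is a minimal user-space
sketch of the pattern. It is not taken from the patches or from lockdep:
every name in it (dwarf_unwind_enabled, dwarf_unwind_sketch,
backtrace_sketch, the message text) is hypothetical, and the unwinder is
only a stand-in that can be told to fail.

```c
#include <stdbool.h>
#include <stdio.h>

/* All names below are hypothetical, for illustration only. */

enum unwind_method { UNWIND_DWARF, UNWIND_FRAME_POINTER };

/* One-way switch: starts enabled, cleared on the first error, never set again. */
static bool dwarf_unwind_enabled = true;

/* Counts fallback notices, so the "exactly one message" rule can be checked. */
static int fallback_messages_printed;

/* Stand-in for the DWARF unwinder: returns 0 on success, -1 on any error. */
static int dwarf_unwind_sketch(bool simulate_error)
{
	return simulate_error ? -1 : 0;
}

/* Reports which method produced the backtrace. */
static enum unwind_method backtrace_sketch(bool simulate_error)
{
	if (dwarf_unwind_enabled) {
		if (dwarf_unwind_sketch(simulate_error) == 0)
			return UNWIND_DWARF;

		/* First strike: disable for good and say so exactly once. */
		dwarf_unwind_enabled = false;
		fallback_messages_printed++;
		fprintf(stderr, "dwarf unwind: error detected, falling back to frame pointers\n");
	}

	/* The guess and frame-pointer based walk would run here. */
	return UNWIND_FRAME_POINTER;
}
```

Once the switch is cleared, later calls skip the DWARF path entirely, even
if the input that tripped it would now unwind fine. A real kernel version
would also have to think about concurrent CPUs hitting the error at the
same time, which this sketch ignores.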

> yep, looks interesting.. not sure about the mathematical proof 
> though ;)

In the physical sense even mathematics is always and unavoidably 
probability based (our brain and all our senses are 
probabilistic), so you can probably replace 'mathematical proof' 
with 'very robust design and a very, very good track record', 
before bothering Linus with it next time around ;-)

And we might as well conclude "it's simply not worth it", at 
some point down the road. I *do* think that it's worth it though, 
and I do think it can be designed and implemented robustly, so 
I'd be willing to try out these patches in -tip for a kernel 
release or two, without pushing it to Linus.

Thanks,

	Ingo