Date: Fri, 14 Jul 2017 10:33:40 +0200
From: Ingo Molnar <mingo@kernel.org>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
        Andres Freund <andres@anarazel.de>, x86@kernel.org,
        linux-kernel@vger.kernel.org, live-patching@vger.kernel.org,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Andy Lutomirski <luto@kernel.org>, Jiri Slaby <jslaby@suse.cz>,
        "H. Peter Anvin" <hpa@zytor.com>, Mike Galbraith <efault@gmx.de>,
        Jiri Olsa <jolsa@redhat.com>,
        Arnaldo Carvalho de Melo <acme@infradead.org>,
        Namhyung Kim <namhyung@kernel.org>,
        Alexander Shishkin <alexander.shishkin@linux.intel.com>
Subject: Re: [PATCH v3 00/10] x86: ORC unwinder (previously undwarf)
Message-ID: <20170714083340.neiavkoxazrljlos@gmail.com>
References: <cover.1499786555.git.jpoimboe@redhat.com>
 <20170712214920.5droainfqjmq7sgu@alap3.anarazel.de>
 <20170712223225.zkq7tdb7pzgb3wy7@treble>
 <20170713071253.a3slz3j5tcgy3rkk@hirez.programming.kicks-ass.net>
 <20170713085015.yjjv5ig2znplx5jl@hirez.programming.kicks-ass.net>
 <20170713085114.h4vjgg7jjbl6dohb@hirez.programming.kicks-ass.net>
 <20170713091911.aj7e7dvrbqcyxh7l@gmail.com>
 <20170713121755.hsuvecrzvyxbdvvk@treble>
 <20170713122115.p62q7ua6gtuapfep@treble>
 <20170713123539.t6az4pnrdjewjvms@treble>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170713123539.t6az4pnrdjewjvms@treble>
User-Agent: NeoMutt/20170113 (1.7.2)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1442
Lines: 32


* Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> > > The results wouldn't be 100% accurate, but they could end up being useful 
> > > over time.
> > 
> > And to expound further on the bad idea, maybe the "bad" addresses could be 
> > filtered out somehow in post-processing (insert lots of hand waving).
> 
> And some details on the post-processing: in most cases it should be possible to 
> determine which of the found stack addresses are valid by looking at the call 
> instructions immediately preceding the stack text addresses, and making sure the 
> call target points to the same function as the previously found address.  But of 
> course that wouldn't work for indirect calls.
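
For concreteness, the check described in the quoted paragraph above could look
roughly like the sketch below. Everything in it is an illustrative assumption
rather than existing kernel or perf code: struct func_range, lookup_func() and
plausible_return_addr() are hypothetical names, the symbol table is a flat
array, and only the x86 direct near call encoding (opcode 0xE8 followed by a
32-bit relative displacement) is recognised, so indirect calls are rejected
just as the quoted text warns.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol-table entry: a function occupies [start, end). */
struct func_range {
	uintptr_t start;
	uintptr_t end;
};

/* Find the function containing addr, or NULL. Linear scan for brevity. */
static const struct func_range *
lookup_func(const struct func_range *tab, size_t n, uintptr_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= tab[i].start && addr < tab[i].end)
			return &tab[i];
	return NULL;
}

/*
 * text is a copy of text_len bytes of code that was loaded at text_base.
 * cand is a candidate return address found on the saved stack; prev is the
 * previously accepted (more recent) text address.
 *
 * Accept cand only if the 5 bytes before it are a direct call (E8 rel32)
 * whose target lies in the same function as prev.
 */
static bool plausible_return_addr(const uint8_t *text, uintptr_t text_base,
				  size_t text_len,
				  const struct func_range *tab, size_t n,
				  uintptr_t cand, uintptr_t prev)
{
	const struct func_range *callee = lookup_func(tab, n, prev);
	uintptr_t target;
	uint32_t raw;
	size_t off;

	if (!callee || cand < text_base + 5 || cand > text_base + text_len)
		return false;

	off = cand - text_base - 5;
	if (text[off] != 0xE8)
		return false;	/* not a direct call; indirect calls end up here */

	/* The displacement is stored little-endian; assemble it bytewise. */
	raw = (uint32_t)text[off + 1] |
	      (uint32_t)text[off + 2] << 8 |
	      (uint32_t)text[off + 3] << 16 |
	      (uint32_t)text[off + 4] << 24;

	/* rel32 is signed and relative to the address after the call. */
	target = cand + (uintptr_t)(intptr_t)(int32_t)raw;

	return target >= callee->start && target < callee->end;
}
```

A post-processor would call plausible_return_addr() for each text address
found on the saved stack, starting from the sampled instruction pointer as the
first prev, and drop candidates for which it returns false.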

I believe this is similar to how OProfile did graph/dwarf profiling, by saving a 
copy of the stack and post-processing it.

As best I recall (I haven't used OProfile that much), it was a performance 
nightmare, was limited (because it saved only part of the stack), and was 
rather fragile as well, because it depended on the task VM being 
post-processable.

I think the highest quality implementation is to generate the call trace either in 
hardware (LBR), or as close to the event as possible: generate the kernel call 
chain in the PMI context, and the user-space call chain before user-space executes 
again (at the latest). Call chain generation should be roughly O(chain_depth), 
which both FP and ORC ensure.
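
To illustrate the O(chain_depth) point for the frame-pointer case, here is a
minimal sketch that walks a chain of saved frame pointers. It is not the
kernel's unwinder: struct frame and walk_fp_chain() are hypothetical names,
the layout (saved frame pointer followed by return address) is the
conventional one with frame pointers enabled, and a real unwinder would also
have to validate each pointer against the stack bounds before dereferencing.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical frame record for illustration: the saved frame pointer of the
 * caller, followed by the return address into the caller.
 */
struct frame {
	const struct frame *next;	/* caller's frame record, NULL at the end */
	uintptr_t ret_addr;		/* return address into the caller */
};

/*
 * Walk the chain starting at fp and store up to max return addresses in
 * out[]. The loop runs once per frame, so the cost is O(chain_depth) and is
 * independent of how many bytes of stack the frames occupy.
 * Returns the number of entries stored.
 */
static size_t walk_fp_chain(const struct frame *fp, uintptr_t *out, size_t max)
{
	size_t n = 0;

	while (fp && n < max) {
		out[n++] = fp->ret_addr;
		fp = fp->next;
	}
	return n;
}
```

Contrast this with scanning a saved copy of the stack word by word, where the
work grows with the number of stack bytes rather than with the number of
frames.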

Thanks,

	Ingo