Date: Mon, 10 Jul 2017 16:29:43 -0700
From: Krister Johansen
To: Arnaldo Carvalho de Melo
Cc: Krister Johansen, Thomas-Mich Richter, Brendan Gregg, Peter Zijlstra,
 Ingo Molnar, Alexander Shishkin, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 tip/perf/core 1/6] perf symbols: find symbols in different mount namespace
Message-ID: <20170710232943.GD6865@templeofstupid.com>
References: <20170705204511.GD29683@templeofstupid.com>
 <1499305693-1599-1-git-send-email-kjlx@templeofstupid.com>
 <1499305693-1599-2-git-send-email-kjlx@templeofstupid.com>
 <20170706194130.GM27350@kernel.org>
 <20170707193640.GA2554@templeofstupid.com>
 <381cf00c-c540-8c20-7182-ecdd94f2d81c@linux.vnet.ibm.com>
 <20170710223924.GC6865@templeofstupid.com>
 <20170710225249.GC27350@kernel.org>
In-Reply-To: <20170710225249.GC27350@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 10, 2017 at 07:52:49PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Jul 10, 2017 at 03:39:25PM -0700, Krister Johansen wrote:
> > On Mon, Jul 10, 2017 at 08:17:00AM +0200, Thomas-Mich Richter wrote:
> > > On 07/07/2017 09:36 PM, Krister Johansen wrote:
> > > > On Thu, Jul 06, 2017 at 04:41:30PM -0300, Arnaldo Carvalho de Melo wrote:
> > > >> On Wed, Jul 05, 2017 at 06:48:08PM -0700, Krister Johansen wrote:
> > > >>> Teach perf how to resolve symbols from binaries that are in a different
> > > >>> mount namespace from the tool.
> > > >>> This allows perf to generate meaningful
> > > >>> stack traces even if the binary resides in a different mount namespace
> > > >>> from the tool.
> > > >>
> > > >> I was trying to find a way to test after applying each of the patches in
> > > >> this series, when it occurred to me that if a process that appears on a
> > > >> perf.data file has exited, how can we access /proc/%ITS_PID/something?
> > > >
> > > > You're correct. We can't access /proc//whatever once the process
> > > > has exited. That was the impetus for patches 4 and 6, which allow us
> > > > to capture the binary (and debuginfo, if it exists) into the buildid
> > > > cache so that if we do have a trace that exists after a process or
> > > > container exits, we'll still be able to resolve some of the symbols.
> > >
> > > Any ideas on how to extend this to be able to resolve symbols after
> > > the process/container exited?
> > > I believe it boils down to how to interpret the mnt inode number in the
> > > PERF_RECORD_NAMESPACE record...
> > > Can this be done post-mortem? Maybe the PERF_RECORD_NAMESPACE record
> > > has to contain more data than just the inode number?
> >
> > I think we're talking past one another. If the container exits then the
> > inode numbers that identify the mount namespace are referring to something
> > that is no longer valid. There's no mount namespace to enter in order
> > to locate the binary objects. They may be on a volume that's no longer
> > mounted.
> >
> > I have a pair of patches in the existing set that copies the binary
> > objects into the buildid cache. This lets you resolve the symbols after
> > the container has exited, provided that you recorded the buildids during
> > the trace.
> >
> > If you apply all the patches in this set, you should be able to generate
> > traces that you can look at with script or report even after the process
> > has exited. I've been able to do it in my tests, at least.
>
> I will work on testing them soon, I just wanted this discussion to take
> place, what you did seems to be the best we can do with the existing
> kernel infrastructure, and is a clear advance, so we need to test and
> merge it.

Happy to have the discussion. Apologies if having the patches
iteratively add to one another isn't the best way to have this reviewed
and understood. If you just apply the first few, you don't get the
support to pull these into the build-id cache.

> Getting the build-ids for the binaries is the key here, then it's just a
> matter of populating a database where to get the matching binaries, we
> wouldn't need even to copy the actual binaries at record time.

Unfortunately, it's not sufficient to save the path to the target binary
because it's possible that after the container exits, and the namespace
is destroyed, there may be no path that describes to the host how to
access the files in the container. There are two different interactions
here that frustrate this:

1. Containers run under a pivoted root, so the container's view of the
   path may be different from the host's view of the path. E.g.
   /usr/bin/node in the container may actually be
   /var/container_a/root/usr/bin/node, or something like that. However,
   see #2.

2. It's also entirely possible for a container to have mounted a
   filesystem that's not accessible or mounted from the host. If, for
   example, you're using docker with the direct-lvm storage driver, then
   your storage device may be mounted in the vfs attached to the
   container, but have no mount in the host's vfs. In a situation like
   this, once the container exits, that lvm filesystem is unmounted.

In order to access the files in that container, you basically need to
setns(2) into the container's mount namespace and look up the files
using a path that resolves in the mount namespace of perf's target.

-K
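
For anyone following along, the setns(2) approach described above can be
sketched roughly as below. This is a minimal illustration, not the code
from the patch series: mntns_path() and open_in_mntns() are hypothetical
names made up for this example, and it assumes the target process is
still alive so that /proc/<pid>/ns/mnt can be opened. setns() with
CLONE_NEWNS normally needs CAP_SYS_ADMIN, so expect EPERM when running
unprivileged.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for setns() and CLONE_NEWNS */
#endif
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: format "/proc/<pid>/ns/mnt" into buf. */
static int mntns_path(char *buf, size_t len, pid_t pid)
{
	int n = snprintf(buf, len, "/proc/%d/ns/mnt", (int)pid);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: open 'path' as seen from the mount namespace of
 * 'pid', then switch back to the caller's original mount namespace.
 * 'flags' should not include O_CREAT (no mode argument is passed).
 * Returns an open fd, or -1 with errno set.  The fd stays usable after
 * we return to the original namespace.  Note that setns() into a mount
 * namespace also resets the caller's root and working directory.
 */
static int open_in_mntns(pid_t pid, const char *path, int flags)
{
	char nspath[64];
	int oldns, newns, fd = -1, saved;

	if (mntns_path(nspath, sizeof(nspath), pid) < 0) {
		errno = ENAMETOOLONG;
		return -1;
	}

	/* Keep a handle on our own namespace so we can come back. */
	oldns = open("/proc/self/ns/mnt", O_RDONLY | O_CLOEXEC);
	if (oldns < 0)
		return -1;

	newns = open(nspath, O_RDONLY | O_CLOEXEC);
	if (newns < 0) {
		saved = errno;
		close(oldns);
		errno = saved;
		return -1;
	}

	if (setns(newns, CLONE_NEWNS) < 0) {
		saved = errno;
	} else {
		/* The path now resolves in the target's mount namespace. */
		fd = open(path, flags | O_CLOEXEC);
		saved = errno;
		if (setns(oldns, CLONE_NEWNS) < 0) {
			/* Could not return; treat as a failure. */
			saved = errno;
			if (fd >= 0)
				close(fd);
			fd = -1;
		}
	}

	close(newns);
	close(oldns);
	errno = saved;
	return fd;
}
```

A caller would do something like open_in_mntns(target_pid,
"/usr/bin/node", O_RDONLY) and then read the ELF image from the returned
fd. None of this helps once the container is gone, which is why copying
the objects into the buildid cache at record time still matters.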