From: ebiederm@xmission.com (Eric W. Biederman)
To: Hari Bathini
Cc: ast@fb.com, peterz@infradead.org, lkml, acme@kernel.org,
    alexander.shishkin@linux.intel.com, mingo@redhat.com,
    daniel@iogearbox.net, rostedt@goodmis.org,
    Ananth N Mavinakayanahalli, sargun@sargun.me, Aravinda Prasad,
    brendan.d.gregg@gmail.com
Date: Wed, 16 Nov 2016 11:27:28 -0600
Message-ID: <87lgwjfi7z.fsf@xmission.com>
In-Reply-To: <7f1d2f36-7bfc-dc97-0de8-f8a3203ca26e@linux.vnet.ibm.com>
Subject: Re: [PATCH 0/3] perf: add support for analyzing events for containers
X-Mailing-List: linux-kernel@vger.kernel.org

Hari Bathini writes:

> On Friday 11 November 2016 01:18 AM, Eric W. Biederman wrote:
>> Hari Bathini writes:
>>
>>> Currently, there is no trivial mechanism to analyze events based on
>>> containers. perf -G can be used, but it will not filter events for the
>>> containers created after perf is invoked, making it difficult to assess/
>>> analyze performance issues of multiple containers at once.
>>>
>>> This patch-set overcomes this limitation by using cgroup identifier as
>>> container unique identifier. A new PERF_RECORD_NAMESPACES event that
>>> records namespaces related info is introduced, from which the cgroup
>>> namespace's inode number is used as cgroup identifier.
>>> This is based on the assumption that each container is created with
>>> its own cgroup namespace, allowing assessment/analysis of multiple
>>> containers using cgroup identifier.
>>>
>>> The first patch introduces PERF_RECORD_NAMESPACES in kernel while the
>>> second patch makes the corresponding changes in perf tool to read these
>>> PERF_RECORD_NAMESPACES events. The third patch adds a cgroup identifier
>>> column in perf report, which is nothing but the cgroup namespace's
>>> inode number. This approach is based on the suggestion from Peter
>>> Zijlstra here: https://patchwork.kernel.org/patch/9305655/
>>
>> Where is the check that ensures that only someone with
>> capable(CAP_SYS_ADMIN) can use this interface? This interface is not
>> namespace clean in multiple dimensions so it can not be used generally.
>
> Right. Will add the check..
>
>> You are not allowed to move struct mount_namespace into
>> include/linux/mnt_namespace.h. Al Viro will crucify you with cause.
>> Those are implementation details the rest of the kernel should not be
>> digging into.
>
> Ouch! How about adding an accessor function(s) in fs/namespace.c ..?

For reasonable things of course. I think the namespace operations from
ns common already have a large set of accessors so I don't know what you
are looking for.

>> Where are the device numbers that go with those inode numbers you are
>> exporting? For now all of those inodes live on the filesystem but I am
>> not giving guarantees to userspace that do not work for ordinary
>> filesystems.
>
> Sorry! I didn't get this..
> Want to use these numbers as identity for namespace (like pid for process..)

Yes, I understand you would like to have a global identifier like pids.

A global identifier would ultimately require the addition of a namespace
of namespaces so the global identifier would be relative to something.
I really don't want to go there. Global identifiers are evil!

So you need to specify not only the inode number but also which
filesystem the inode number applies to. Aka the device number of the
appropriate filesystem as well.

Also please don't forget that modern inode numbers are 64bit, not 32bit.
I don't know if that freedom will be used with namespaces or not, but we
need the freedom in a userspace API to make that change without breaking
userspace.

Eric
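
To illustrate the point about pairing an inode number with its device
number, here is a minimal userspace sketch in Python. It assumes a Linux
system where /proc/<pid>/ns/cgroup is available. The names NamespaceId,
namespace_id and same_namespace are hypothetical helpers made up for this
example; they are not part of the patch set, of perf, or of any kernel
interface. The sketch only shows the comparison rule (both st_dev and
st_ino must match), not how PERF_RECORD_NAMESPACES should be laid out.

```python
import os
from typing import NamedTuple, Optional


class NamespaceId(NamedTuple):
    """Hypothetical userspace identity of a namespace: (device, inode)."""
    dev: int  # st_dev of the filesystem hosting the namespace inode
    ino: int  # st_ino; Python ints are unbounded, so 64-bit values fit


def namespace_id(pid="self", ns="cgroup") -> Optional[NamespaceId]:
    """Return (st_dev, st_ino) for /proc/<pid>/ns/<ns>, or None if unavailable."""
    path = f"/proc/{pid}/ns/{ns}"
    try:
        # os.stat() follows the /proc symlink to the namespace's own inode.
        st = os.stat(path)
    except OSError:
        # Missing /proc, unknown namespace type, or insufficient permission.
        return None
    return NamespaceId(dev=st.st_dev, ino=st.st_ino)


def same_namespace(a: Optional[NamespaceId], b: Optional[NamespaceId]) -> bool:
    """Two identities match only if both the device and the inode match."""
    return a is not None and b is not None and a == b


if __name__ == "__main__":
    ident = namespace_id()
    if ident is None:
        print("cgroup namespace information is not available here")
    else:
        print(f"cgroup namespace: dev={ident.dev:#x} ino={ident.ino}")
```

Comparing the inode number alone would treat two namespaces as equal
whenever their inodes happened to collide across different filesystems,
which is the guarantee the message above declines to give.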