MIME-Version: 1.0
In-Reply-To: <20160413190152.GA29753@mail.hallyn.com>
References: <20160321234133.GA22463@mail.hallyn.com> <20160413175736.GC3676@htj.duckdns.org>
 <20160413184639.GA29483@mail.hallyn.com> <20160413185033.GH3676@htj.duckdns.org>
 <20160413190152.GA29753@mail.hallyn.com>
From: Aditya Kali <adityakali@google.com>
Date: Wed, 13 Apr 2016 16:31:07 -0700
Message-ID: <CAGr1F2HXJ1BdMFY+vF40O_khE+4S7OnbQPv-h1Q_AmGGhL7mzw@mail.gmail.com>
Subject: Re: [RFC PATCH] cgroup namespaces: add a 'nsroot=' mountinfo field
To: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Tejun Heo <tj@kernel.org>, Linux API <linux-api@vger.kernel.org>,
        Linux Containers <containers@lists.osdl.org>,
        "Eric W. Biederman" <ebiederm@xmission.com>,
        cgroups mailinglist <cgroups@vger.kernel.org>,
        lkml <linux-kernel@vger.kernel.org>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2038
Lines: 42

On Wed, Apr 13, 2016 at 12:01 PM, Serge E. Hallyn <serge@hallyn.com> wrote:
> Quoting Tejun Heo (tj@kernel.org):
>> Hello, Serge.
>>
>> On Wed, Apr 13, 2016 at 01:46:39PM -0500, Serge E. Hallyn wrote:
>> > It's not a leak of any information we're trying to hide.  I realize
>> > something like 8 years have passed, but I still basically go by the
>> > ksummit guidance that containers are ok but the kernel's first priority
>> > is to facilitate containers but not trick containers into thinking
>> > they're not containerized.  So long as the container is properly set
>> > up, I don't think there's anything the workload could do with the
>> > nsroot= info other than *know* that it is in a ns cgroup.
>> >
>> > If we did change that guidance, there's a slew of proc info that we
>> > could better virtualize :)
>>
>> I see.  I'm just wondering because the information here seems a bit
>> gratuitous.  Isn't the only thing necessary telling whether the root
>> is bind mounted or namescoped?  Wouldn't simple "nsroot" work for that
>> purpose?
>
> I don't think so - we could be in a cgroup namespace but still have
> access only to bind-mounted cgroups.  So we need to compare the
> superblock dentry root field to the nsroot= value.

Umm, I don't think this is such a good idea. The main purpose of
cgroup namespaces was to prevent the exposure of the system cgroup
hierarchy that used to happen through /proc/self/cgroup. Wouldn't
showing that information in /proc/self/mountinfo defeat the purpose?
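
For reference, the comparison Serge describes above (mountinfo root
field vs. the nsroot= value) could look roughly like the sketch below.
It assumes the proposed nsroot= value shows up among the per-superblock
options at the end of a /proc/self/mountinfo line, which is only what
the RFC suggests and not an existing interface. The function names and
return labels are made up for illustration.

```python
def parse_mountinfo_line(line):
    """Split one /proc/<pid>/mountinfo line into the fields used here."""
    fields = line.split()
    # Optional fields (e.g. "shared:5") end at a lone "-" separator.
    sep = fields.index("-")
    root = fields[3]            # root of the mount within the filesystem
    mount_point = fields[4]
    fstype = fields[sep + 1]
    # Super options are the third field after the separator, if present.
    super_opts = fields[sep + 3].split(",") if len(fields) > sep + 3 else []
    return root, mount_point, fstype, super_opts


def cgroup_mount_kind(line):
    """Classify a cgroup mount using the (hypothetical) nsroot= option.

    Returns None for non-cgroup mounts, "unknown" when no nsroot= option
    is present, "namespace-root" when the mount root equals nsroot, and
    "bind-or-subdir" otherwise.
    """
    root, _mount_point, fstype, super_opts = parse_mountinfo_line(line)
    if fstype not in ("cgroup", "cgroup2"):
        return None
    nsroot = None
    for opt in super_opts:
        if opt.startswith("nsroot="):   # assumed option name from the RFC
            nsroot = opt[len("nsroot="):]
    if nsroot is None:
        return "unknown"
    return "namespace-root" if root == nsroot else "bind-or-subdir"
```

Note that any tool doing this necessarily sees the host-side cgroup
path in both fields, which is the exposure in question.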

> One practical problem I've found with cgroup namespaces is that there
> is no way to disambiguate between a cgroupfs mount which was done in
> a cgroup namespace, and a bind mount of a cgroupfs directory.

That's actually by design, no? Namespaced apps should not know or care
whether they are running inside a namespace. If they can find that out
today, it's only because of certain side effects. I fear that adding an
explicit "nsroot" or similar field to /proc/self/mountinfo would turn
it into an API, making it hard to virtualize user apps again.

-- 
Aditya