Subject: Re: Documenting the ioctl interfaces to discover relationships between namespaces
To: "Eric W. Biederman"
References: <87poky5ca9.fsf@xmission.com> <6771af94-9847-0277-ec1d-62bc3649a17a@gmail.com> <87r35df1u4.fsf@xmission.com>
Cc: mtk.manpages@gmail.com, Andrei Vagin, Containers, Linux API, lkml, "linux-fsdevel@vger.kernel.org", James Bottomley, "W. Trevor King", Alexander Viro, "Serge E. Hallyn"
From: "Michael Kerrisk (man-pages)"
Message-ID: <154f7f6e-106d-6340-0dcd-e268525f435d@gmail.com>
Date: Wed, 14 Dec 2016 08:32:45 +0100
In-Reply-To: <87r35df1u4.fsf@xmission.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/12/2016 07:18 PM, Eric W. Biederman wrote:
> "Michael Kerrisk (man-pages)" writes:
>
>> On 12/11/2016 11:30 PM, Eric W. Biederman wrote:
>>> "Michael Kerrisk (man-pages)" writes:
>>>
>>>> [was: [PATCH 0/4 v3] Add an interface to discover relationships
>>>> between namespaces]
>>>
>>> One small comment below.
>>>
>>>>
>>>> Introspecting namespace relationships
>>>>     Since Linux 4.9, two ioctl(2) operations are provided to allow
>>>>     introspection of namespace relationships (see user_namespaces(7)
>>>>     and pid_namespaces(7)).  The form of the calls is:
>>>>
>>>>         ioctl(fd, request);
>>>>
>>>>     In each case, fd refers to a /proc/[pid]/ns/* file.
>>>>
>>>>     NS_GET_USERNS
>>>>         Returns a file descriptor that refers to the owning user
>>>>         namespace for the namespace referred to by fd.
>>>>
>>>>     NS_GET_PARENT
>>>>         Returns a file descriptor that refers to the parent
>>>>         namespace of the namespace referred to by fd.  This
>>>>         operation is valid only for hierarchical namespaces (i.e.,
>>>>         PID and user namespaces).  For user namespaces,
>>>>         NS_GET_PARENT is synonymous with NS_GET_USERNS.
>>>>
>>>>     In each case, the returned file descriptor is opened with O_RDONLY
>>>>     and O_CLOEXEC (close-on-exec).
>>>>
>>>>     By applying fstat(2) to the returned file descriptor, one obtains
>>>>     a stat structure whose st_ino (inode number) field identifies the
>>>>     owning/parent namespace.  This inode number can be matched with
>>>>     the inode number of another /proc/[pid]/ns/{pid,user} file to
>>>>     determine whether that is the owning/parent namespace.
>>>
>>> Like all fstat inode comparisons to be fully accurate you need to
>>> compare both the st_ino and st_dev.  I reserve the right for st_dev to
>>> be significant when comparing namespaces.  Otherwise I might have to
>>> create a namespace of namespaces someday and that is ugly.
>>>
>>>>     Either of these ioctl(2) operations can fail with the following
>>>>     error:
>>>>
>>>>     EPERM  The requested namespace is outside of the caller's
>>>>            namespace scope.  This error can occur if, for example,
>>>>            the owning user namespace is an ancestor of the caller's
>>>>            current user namespace.  It can also occur on attempts to
>>>>            obtain the parent of the initial user or PID namespace.
>>>>
>>>>     Additionally, the NS_GET_PARENT operation can fail with the
>>>>     following error:
>>>>
>>>>     EINVAL fd refers to a nonhierarchical namespace.
>>>>
>>>>     See the EXAMPLE section for an example of the use of these
>>>>     operations.
>>
>> So, after playing with this a bit, I have a question.
>>
>> I gather that in order to, for example, elaborate the tree of user
>> namespaces on the system, one would use NS_GET_PARENT on each of
>> the /proc/*/ns/user files and match up the results. Right?
>>
>> What happens if one of the parent user namespaces contains no
>> processes? That is, the parent namespace exists by virtue of being
>> pinned because a proc/PID/ns/user file is open or bind mounted.
>> (Chrome seems to do this sort of dance with user namespaces, for
>> example.) How do we find the ancestor of *that* user namespace?
>
> What is returned from NS_GET_USERNS and NS_GET_PARENT is a file
> descriptor, that you can call NS_GET_PARENT on.

Thanks, Eric. While trying to solve the small task I set myself,
and probably confused by past discussions[1], I was overlooking
the obvious.

Cheers,

Michael

[1] https://lkml.org/lkml/2016/7/28/365

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
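
For illustration, the point Eric makes (the descriptor returned by
NS_GET_PARENT can itself be passed to NS_GET_PARENT) might be sketched
roughly as below. This is only a minimal sketch, not code from the
thread or from the man page: the helper names same_ns_stat() and
count_visible_ancestors() are made up for the example, and it assumes
kernel headers that provide <linux/nsfs.h> (Linux 4.9 or later). The
comparison helper checks both st_dev and st_ino, as requested in the
review comment above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/nsfs.h>     /* NS_GET_PARENT (assumes Linux 4.9+ headers) */

/* Hypothetical helper: two stat results identify the same namespace
   only if both the device and the inode number match. */
bool same_ns_stat(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical helper: starting from a descriptor for a PID or user
   namespace (e.g. an open /proc/PID/ns/user file), follow
   NS_GET_PARENT until the kernel refuses with EPERM, which marks
   either the initial namespace or the edge of the caller's scope.
   Returns the number of ancestors reached, or -1 on any other error
   (for example EINVAL for a nonhierarchical namespace). The caller's
   descriptor is left open. */
int count_visible_ancestors(int fd)
{
    assert(fd >= 0);

    int cur = dup(fd);          /* work on a copy we are free to close */
    if (cur == -1)
        return -1;

    int depth = 0;
    for (;;) {
        int parent = ioctl(cur, NS_GET_PARENT);
        if (parent == -1) {
            int saved = errno;
            close(cur);
            errno = saved;
            return saved == EPERM ? depth : -1;
        }
        /* No process needs to live in the parent namespace: the
           returned descriptor alone is enough to keep walking. */
        close(cur);
        cur = parent;
        depth++;
    }
}
```

A caller would open a /proc/PID/ns/user file with O_RDONLY and pass
the descriptor to count_visible_ancestors(). To build the full tree,
one would fstat(2) each descriptor obtained along the way and use
same_ns_stat() to match it against namespaces already seen.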