Date: Thu, 30 Jul 2020 16:42:54 +0200
From: Christian Brauner
To: "Eric W. Biederman"
Cc: Kirill Tkhai, viro@zeniv.linux.org.uk, adobriyan@gmail.com,
 davem@davemloft.net, akpm@linux-foundation.org, areber@redhat.com,
 serge@hallyn.com, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 00/23] proc: Introduce /proc/namespaces/ directory to expose namespaces lineary
Message-ID: <20200730144254.uabteale5tvtpkzp@wittgenstein>
References: <159611007271.535980.15362304262237658692.stgit@localhost.localdomain>
 <87k0yl5axy.fsf@x220.int.ebiederm.org>
In-Reply-To: <87k0yl5axy.fsf@x220.int.ebiederm.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 30, 2020 at 09:34:01AM -0500, Eric W.
Biederman wrote:
> Kirill Tkhai writes:
>
> > Currently, there is no a way to list or iterate all or subset of namespaces
> > in the system. Some namespaces are exposed in /proc/[pid]/ns/ directories,
> > but some also may be as open files, which are not attached to a process.
> > When a namespace open fd is sent over unix socket and then closed, it is
> > impossible to know whether the namespace exists or not.
> >
> > Also, even if namespace is exposed as attached to a process or as open file,
> > iteration over /proc/*/ns/* or /proc/*/fd/* namespaces is not fast, because
> > this multiplies at tasks and fds number.
>
> I am very dubious about this.
>
> I have been avoiding exactly this kind of interface because it can
> create rather fundamental problems with checkpoint restart.
>
> You do have some filtering and the filtering is not based on current.
> Which is good.
>
> A view that is relative to a user namespace might be ok. It almost
> certainly does better as it's own little filesystem than as an extension
> to proc though.
>
> The big thing we want to ensure is that if you migrate you can restore
> everything. I don't see how you will be able to restore these files
> after migration. Anything like this without having a complete
> checkpoint/restore story is a non-starter.
>
> Further by not going through the processes it looks like you are
> bypassing the existing permission checks. Which has the potential
> to allow someone to use a namespace who would not be able to otherwise.
>
> So I think this goes one step too far but I am willing to be persuaded
> otherwise.

I think we discussed this at Plumbers (last year I want to say?) and you
were against making this a part of procfs already back then, I think.
The last known idea we could agree on was debugfs (shudder). But a tiny
separate fs might work as well. We really would want those introspection
abilities this provides though. For us it was for debugging when
namespaces linger and also to crawl and inspect namespaces from LXD and
various other use-cases. So if we could make this happen in some form
that'd be great.

Thanks!
Christian
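
For reference, the userspace crawl that the quoted cover letter describes as
slow can be sketched roughly as below. This is a minimal illustration and not
part of the patch series: the names crawl_namespaces and parse_ns_link are
hypothetical, and the only thing assumed about the kernel is that each
/proc/[pid]/ns/* entry is a symlink whose target reads like net:[4026531992].
The sketch only finds namespaces that some visible task is a member of;
namespaces held only as open files are missed, and the cost grows with the
number of tasks.

```python
import os
import re
from collections import defaultdict

# Link targets under /proc/[pid]/ns/ look like "net:[4026531992]".
NS_LINK = re.compile(r"^(?P<type>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split 'net:[4026531992]' into ('net', 4026531992); None if malformed."""
    m = NS_LINK.match(target)
    return (m.group("type"), int(m.group("inode"))) if m else None


def crawl_namespaces(proc_root="/proc"):
    """Map (type, inode) -> set of PIDs by walking <proc_root>/<pid>/ns/*.

    proc_root is a parameter only so the walk can be pointed at a test tree.
    """
    found = defaultdict(set)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        ns_dir = os.path.join(proc_root, entry, "ns")
        try:
            names = os.listdir(ns_dir)
        except OSError:
            continue  # task exited, or no permission to look
        for name in names:
            try:
                target = os.readlink(os.path.join(ns_dir, name))
            except OSError:
                continue  # not a symlink, or not readable
            parsed = parse_ns_link(target)
            if parsed is not None:
                found[parsed].add(int(entry))
    return dict(found)


if __name__ == "__main__":
    for (ns_type, inode), pids in sorted(crawl_namespaces().items()):
        print(f"{ns_type}:[{inode}] used by {len(pids)} task(s)")
```

Run unprivileged, this reports only the tasks whose ns directory the caller is
allowed to read, which is the per-process permission check mentioned in the
quoted reply.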