Date: Sun, 6 Mar 2016 02:28:20 -0600
From: "Serge E. Hallyn"
To: "Eric W. Biederman", lkml, Seth Forshee, Stéphane Graber, serge@hallyn.com, Andy Lutomirski
Subject: user namespace and fully visible proc and sys mounts
Message-ID: <20160306082820.GA1917@mail.hallyn.com>

Hi,

So we've been over this many times... but unfortunately there is more
breakage to report. Regular privileged and unprivileged containers
work all right for us. But running an unprivileged container inside a
privileged container is blocked.

When creating privileged containers, lxc by default does a few things:

 - It mounts some fuse.lxcfs files over proc files, including
   /proc/meminfo and /proc/uptime.
 - It mounts proc rw but /proc/sysrq-trigger ro. It also moves
   /proc/sys/net out of the way, bind-mounts /proc/sys read-only
   (because this container is not in a user namespace), then moves
   /proc/sys/net back.
 - Finally, it mounts sys ro but bind-mounts /sys/devices/virtual/net
   as writeable.

If any of these are left enabled, unprivileged containers can't be
started inside the privileged container. If all are disabled, then
they can be.

Can we find a way to make these not block remounts in child user
namespaces? A boot flag, a procfs and sysfs mount option, a sysctl?

-serge
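
For readers who want the mount sequence above spelled out, here is a
minimal dry-run sketch in Python. It only builds a list of steps and
performs no mounts. The names Step, privileged_plan and overmounts are
made up for illustration, the lxcfs source directory and the stash path
are assumptions rather than what lxc necessarily uses, and the ordering
of the steps is inferred from the description above, not taken from the
lxc source.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    action: str      # "mount", "bind" or "move"
    source: str
    target: str
    read_only: bool


def privileged_plan(lxcfs: str = "/var/lib/lxcfs",
                    stash: str = "/stash") -> List[Step]:
    """Ordered steps for a privileged container, as described in the mail.

    `lxcfs` and `stash` are hypothetical placeholder paths.
    """
    plan = [Step("mount", "proc", "/proc", False)]
    # lxcfs files bind-mounted over individual proc files
    for name in ("meminfo", "uptime"):
        plan.append(Step("bind", f"{lxcfs}/proc/{name}", f"/proc/{name}", False))
    # proc stays rw, but sysrq-trigger is made read-only
    plan.append(Step("bind", "/proc/sysrq-trigger", "/proc/sysrq-trigger", True))
    # keep /proc/sys/net writable while /proc/sys becomes read-only
    plan.append(Step("move", "/proc/sys/net", stash, False))
    plan.append(Step("bind", "/proc/sys", "/proc/sys", True))
    plan.append(Step("move", stash, "/proc/sys/net", False))
    # sys is read-only except for the virtual net devices
    plan.append(Step("mount", "sysfs", "/sys", True))
    plan.append(Step("bind", "/sys/devices/virtual/net",
                     "/sys/devices/virtual/net", False))
    return plan


def overmounts(plan: List[Step]) -> List[str]:
    """Paths below /proc or /sys that end up covered by another mount."""
    return sorted({s.target for s in plan
                   if s.target.startswith(("/proc/", "/sys/"))})


if __name__ == "__main__":
    for path in overmounts(privileged_plan()):
        print(path)
```

The list returned by overmounts() is simply every path from the
description that sits on top of the base proc or sys mount. Which of
those the kernel actually treats as making proc or sys not fully
visible is the open question of this mail, so the sketch makes no claim
about that.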