Subject: Re: Thoughts on tightening up user namespace creation
To: Colin Walters, Kees Cook, Andy Lutomirski
Cc: linux-kernel@vger.kernel.org, "Eric W. Biederman", Linux Containers, Alexander Larsson, Serge Hallyn, Stephane Graber, Seth Forshee
From: "Austin S. Hemmelgarn"
Date: Wed, 9 Mar 2016 14:04:26 -0500
Message-ID: <56E073BA.3000009@gmail.com>
In-Reply-To: <1457549467.650797.544465346.49653120@webmail.messagingengine.com>
References: <1457549467.650797.544465346.49653120@webmail.messagingengine.com>

On 2016-03-09 13:51, Colin Walters wrote:
> On Wed, Mar 9, 2016, at 01:14 PM, Kees Cook wrote:
>> On Mon, Mar 7, 2016 at 9:15 PM, Andy Lutomirski wrote:
>>> Hi all-
>>>
>>> There are several users and distros that are nervous about user
>>> namespaces from an attack surface point of view.
>>>
>>>  - RHEL and Arch have userns disabled.
>>>
>>>  - Ubuntu requires CAP_SYS_ADMIN
>>>
>>>  - Kees periodically proposes to upstream some sysctl to control
>>> userns creation.
>>
>> And here's another ring0 escalation flaw, made available to
>> unprivileged users because of userns:
>>
>> https://code.google.com/p/google-security-research/issues/detail?id=758
>
> Looks like Andy won't have to eat his hat ;)
>
>> The change in attack surface is _substantial_. We must have a way to
>> globally disable userns.
>
> No one would object if it was enabled but only accessible to
> CAP_SYS_ADMIN though, right? This could be useful for
> writing setuid binaries that expose some of the features, but e.g. not
> CAP_NET_ADMIN.
At least Google Chrome (and probably Chromium) is using user namespaces
without CAP_SYS_ADMIN (although AFAIUI, it's because they can't use the
other namespace types effectively as a regular user).
>
> Andy's suggestion of having this be a per-namespace setting makes
> sense to me. Currently some container tools that do use userns
> are by default denying it to be recursive (Sandstorm.io and Docker 1.10 at least)
> by using a seccomp filter on clone(). If we had this setting that
> filter wouldn't be necessary, and would solve the issue that seccomp filters
> aren't robust against the kernel adding new API, e.g. a new CLONE_NEWUSER_NONEWPRIVS
> which might enable chroot() but not CAP_NET_ADMIN.
>
Personally, I like the suggestion from Alexander Larsson to make a cgroup
controller. Container tools obviously want some degree of hierarchical
control (even if it's just saying that the hierarchy ends here), and it
would simplify the possibility of running more than one container stack
on the same host (I know at least a couple of people who would love to be
able to safely use Docker on the same host as LXC or lmctfy).
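
To make the quoted point about seccomp filters on clone() a bit more concrete, here is a toy model in Python. It is not a real seccomp BPF program and it is not what Sandstorm.io or Docker actually ship; it only mimics the flag test such a filter would perform. CLONE_NEWUSER and CLONE_NEWNET use their values from <linux/sched.h>. CLONE_NEWUSER_NONEWPRIVS is the hypothetical flag from the quoted text, and the bit assigned to it below is made up purely for illustration.

```python
# Toy model of a denylist-style filter on clone() flags.
# This only imitates the bit test; it installs no real seccomp filter.

CLONE_NEWUSER = 0x10000000  # real value from <linux/sched.h>
CLONE_NEWNET = 0x40000000   # real value from <linux/sched.h>

# Hypothetical flag named in the quoted mail. No such flag exists in the
# kernel; the bit position here is an assumption chosen for this sketch.
CLONE_NEWUSER_NONEWPRIVS = 1 << 40


def denylist_filter_allows(flags: int) -> bool:
    """Return True if the modelled filter would let this clone() call through.

    The filter knows about exactly one forbidden bit, CLONE_NEWUSER,
    and allows everything else.
    """
    return (flags & CLONE_NEWUSER) == 0


if __name__ == "__main__":
    for name, flags in [
        ("CLONE_NEWNET", CLONE_NEWNET),
        ("CLONE_NEWUSER | CLONE_NEWNET", CLONE_NEWUSER | CLONE_NEWNET),
        ("CLONE_NEWUSER_NONEWPRIVS (hypothetical)", CLONE_NEWUSER_NONEWPRIVS),
    ]:
        verdict = "allowed" if denylist_filter_allows(flags) else "denied"
        print(f"{name}: {verdict}")
```

The last case is the robustness problem: a filter written before the new flag existed has no rule for it, so a call using it passes straight through even though it would create a user namespace. A setting enforced by the kernel itself would not have that gap.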