From: Andy Lutomirski
Date: Mon, 7 Mar 2016 21:15:25 -0800
Subject: Thoughts on tightening up user namespace creation
To: "linux-kernel@vger.kernel.org", "Eric W. Biederman", Linux Containers, Alexander Larsson, Colin Walters, Serge Hallyn, Stephane Graber, Kees Cook, Seth Forshee

Hi all-

There are several users and distros that are nervous about user
namespaces from an attack surface point of view.

 - RHEL and Arch have userns disabled.

 - Ubuntu requires CAP_SYS_ADMIN

 - Kees periodically proposes to upstream some sysctl to control
   userns creation.

I think there are three main types of concerns. First, there might be
some as-yet-unknown semantic issues that would allow privilege
escalation by users who create user namespaces and then confuse
something else in the system. Second, enabling user namespaces
exposes a lot of attack surface to unprivileged users. Third,
allowing tasks to create user namespaces exposes the kernel to various
resource exhaustion attacks that wouldn't be possible otherwise.

Since I doubt we'll ever fully address the attack surface issue at
least, would it make sense to try to come up with an upstreamable way
to limit who can create new user namespaces and/or do various
dangerous things with them?

I'll divide the rest of the email into the "what" and the "who".

+++ What does the privilege of creating a user namespace entail? +++

This could be an all-or-nothing thing.
It would certainly be possible for appropriately privileged tasks to
be able to unshare namespaces and use their facilities exactly like
any task can in a current user-ns-enabled kernel and for other tasks
to be unable to unshare anything.

Finer gradations are, in principle, possible. For example, it could
be possible for a given task to unshare its userns but to have limited
caps inside or to be unable to unshare certain other namespaces. For
example, maybe a task could unshare userns and mount ns but not net
ns. I don't think this would be particularly useful.

It might be more interesting to allow a task to unshare all
namespaces, hold all capabilities in them, but to still be unable to
use certain privileged facilities. For example, maybe denying
administrative control over iptables, creation of exotic network
interface types, or similar would make sense. I don't know how we'd
specify this type of constraint.

+++ Who can create user namespaces (possibly with restrictions)? +++

I can think of a few formulations.

A simpler approach would be to add a per-namespace setting listing
users and/or groups that can unshare their userns. A userns starts
out allowing everyone to unshare userns, and anyone with CAP_SYS_ADMIN
can change the setting.

A fancier approach would be to have an fd that represents the right to
unshare your userns. Some privilege broker could give out those fds
to apps that need them and meet whatever criteria are set. If you try
to unshare your userns without the fd, it falls back to some simpler
policy.

I think I prefer the simpler one. It's simple, and I haven't come up
with a concrete problem with it yet.

Thoughts?
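
For concreteness, the simpler approach could be modeled roughly as
follows. This is only a userspace sketch in Python, not kernel code.
The names UsernsPolicy, set_allowed and may_unshare are hypothetical
and nothing like them exists in the kernel. It also assumes that
matching either the user list or the group list is enough, and it
leaves open which namespace the CAP_SYS_ADMIN check is made against.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class UsernsPolicy:
    """Hypothetical per-namespace setting: who may unshare their userns.

    None for both lists means "no restriction", the initial state.
    """
    allowed_uids: Optional[Set[int]] = None
    allowed_gids: Optional[Set[int]] = None

    def set_allowed(self, caller_has_cap_sys_admin: bool,
                    uids: Set[int], gids: Set[int]) -> None:
        # Only a caller holding CAP_SYS_ADMIN may change the setting.
        # (Assumption: the capability check is reduced to a boolean here.)
        if not caller_has_cap_sys_admin:
            raise PermissionError("CAP_SYS_ADMIN required")
        self.allowed_uids = set(uids)
        self.allowed_gids = set(gids)

    def may_unshare(self, uid: int, gids: Set[int]) -> bool:
        # A userns starts out allowing everyone to unshare.
        if self.allowed_uids is None and self.allowed_gids is None:
            return True
        # Assumption: a match on either list is sufficient.
        if uid in (self.allowed_uids or set()):
            return True
        return bool(gids & (self.allowed_gids or set()))
```

In this model a fresh UsernsPolicy() lets every task through, and
after set_allowed(True, {1000}, {27}) only uid 1000 or members of
gid 27 pass the check. The fd-based variant would replace the list
lookup with a check that the caller holds the right fd and would fall
back to something like may_unshare() otherwise.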