Subject: Re: [PATCH v3 00/25] user_namespace: introduce fsid mappings
From: James Bottomley
To: Christian Brauner, Stéphane Graber, "Eric W. Biederman", Aleksa Sarai, Jann Horn
Cc: Kees Cook, Jonathan Corbet, linux-kernel@vger.kernel.org, containers@lists.linux-foundation.org, smbarber@chromium.org, Seth Forshee, linux-security-module@vger.kernel.org, Alexander Viro, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, Alexey Dobriyan
Date: Tue, 18 Feb 2020 15:50:56 -0800
Message-ID: <1582069856.16681.59.camel@HansenPartnership.com>
In-Reply-To: <20200218143411.2389182-1-christian.brauner@ubuntu.com>

On Tue, 2020-02-18 at 15:33 +0100, Christian Brauner wrote:
> In the usual case of running an unprivileged container we will have
> setup an id mapping, e.g. 0 100000 100000. The on-disk mapping will
> correspond to this id mapping, i.e. all files which we want to appear
> as 0:0 inside the user namespace will be chowned to 100000:100000 on
> the host.
> This works, because whenever the kernel needs to do a
> filesystem access it will lookup the corresponding uid and gid in the
> idmapping tables of the container. Now think about the case where we
> want to have an id mapping of 0 100000 100000 but an on-disk mapping
> of 0 300000 100000 which is needed to e.g. share a single on-disk
> mapping with multiple containers that all have different id mappings.
> This will be problematic. Whenever a filesystem access is requested,
> the kernel will now try to lookup a mapping for 300000 in the id
> mapping tables of the user namespace but since there is none the
> files will appear to be owned by the overflow id, i.e. usually
> 65534:65534 or nobody:nogroup.
>
> With fsid mappings we can solve this by writing an id mapping of 0
> 100000 100000 and an fsid mapping of 0 300000 100000. On filesystem
> access the kernel will now lookup the mapping for 300000 in the fsid
> mapping tables of the user namespace. And since such a mapping
> exists, the corresponding files will have correct ownership.

I compiled this in order to run the shiftfs tests over it and see how
it copes with the various corner cases. However, what I find is that it
simply fails the fsid reverse mapping in the setup. Trying to use a
simple uid mapping of 0 100000 1000 and an fsid mapping of 100000 0 1000
fails the entry setuid(0) call because of this code:

    long __sys_setuid(uid_t uid)
    {
    	struct user_namespace *ns = current_user_ns();
    	const struct cred *old;
    	struct cred *new;
    	int retval;
    	kuid_t kuid;
    	kuid_t kfsuid;

    	kuid = make_kuid(ns, uid);
    	if (!uid_valid(kuid))
    		return -EINVAL;

    	kfsuid = make_kfsuid(ns, uid);
    	if (!uid_valid(kfsuid))
    		return -EINVAL;

which means you can't have an fsid mapping that doesn't have the same
domain as the uid mapping, so a reverse mapping isn't possible, because
its range and domain have to be inverse and disjoint.

James
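
To make the failure concrete, here is a minimal user-space sketch of the two validity checks, assuming a single extent per map. The names struct extent, map_down and sketch_setuid are hypothetical and exist only for illustration; this is not the kernel's code, just the same "is this id inside the map's domain" test applied to the two mappings from the report.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_ID ((uint32_t)-1)

/* One extent as written to a map file: "first lower_first count".
 * Hypothetical type, for illustration only. */
struct extent {
	uint32_t first;       /* first id as seen inside the namespace */
	uint32_t lower_first; /* first id it maps to outside */
	uint32_t count;       /* number of ids in the range */
};

/* Map a namespace id to the outside id, or INVALID_ID if the id is
 * not in the extent's domain [first, first + count). */
static uint32_t map_down(const struct extent *e, uint32_t id)
{
	assert(e != NULL);
	if (id >= e->first && id - e->first < e->count)
		return e->lower_first + (id - e->first);
	return INVALID_ID;
}

/* Mirrors only the two checks quoted above: the same uid must be
 * valid in both the uid map and the fsuid map. */
static int sketch_setuid(const struct extent *uid_map,
			 const struct extent *fsuid_map, uint32_t uid)
{
	if (map_down(uid_map, uid) == INVALID_ID)
		return -EINVAL;
	if (map_down(fsuid_map, uid) == INVALID_ID)
		return -EINVAL;
	return 0;
}
```

With a uid extent of {0, 100000, 1000} and an fsid extent of {100000, 0, 1000}, uid 0 is inside the uid map's domain but outside the fsid map's domain (which starts at 100000), so sketch_setuid() returns -EINVAL. If the fsid extent shares the uid map's domain, for example {0, 300000, 1000}, the same call passes both checks.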