References: <20201115103718.298186-1-christian.brauner@ubuntu.com> <20201115103718.298186-15-christian.brauner@ubuntu.com>
In-Reply-To: <20201115103718.298186-15-christian.brauner@ubuntu.com>
From: Paul Moore
Date: Sun, 22 Nov 2020 16:18:55 -0500
Subject: Re: [PATCH v2 14/39] commoncap: handle idmapped mounts
To: Christian Brauner
Cc: Alexander Viro, Christoph Hellwig, linux-fsdevel@vger.kernel.org, John Johansen, James Morris, Mimi Zohar, Dmitry Kasatkin, Stephen Smalley, Casey Schaufler, Arnd Bergmann, Andreas Dilger, OGAWA Hirofumi, Geoffrey Thomas, Mrunal Patel, Josh Triplett, Andy Lutomirski, Theodore Tso, Alban Crequy, Tycho Andersen, David Howells, James Bottomley, Jann Horn, Seth Forshee, Stéphane Graber, Aleksa Sarai, Lennart Poettering, "Eric W.
 Biederman", smbarber@chromium.org, Phil Estes, Serge Hallyn, Kees Cook, Todd Kjos, Jonathan Corbet, containers@lists.linux-foundation.org, linux-security-module@vger.kernel.org, linux-api@vger.kernel.org, linux-ext4@vger.kernel.org, linux-audit@redhat.com, linux-integrity@vger.kernel.org, selinux@vger.kernel.org, Christoph Hellwig
X-Mailing-List: linux-ext4@vger.kernel.org

On Sun, Nov 15, 2020 at 5:39 AM Christian Brauner wrote:
> When interacting with user namespace and non-user namespace aware
> filesystem capabilities the vfs will perform various security checks to
> determine whether or not the filesystem capabilities can be used by the
> caller (e.g. during exec), or even whether they need to be removed. The
> main infrastructure for this resides in the capability codepaths but they
> are called through the LSM security infrastructure even though they are not
> technically an LSM or optional. This extends the existing security hooks
> security_inode_removexattr(), security_inode_killpriv(),
> security_inode_getsecurity() to pass down the mount's user namespace and
> makes them aware of idmapped mounts.
> In order to actually get filesystem capabilities from disk the capability
> infrastructure exposes the get_vfs_caps_from_disk() helper. For user
> namespace aware filesystem capabilities a root uid is stored alongside the
> capabilities.
> In order to determine whether the caller can make use of the filesystem
> capability or whether it needs to be ignored it is translated according to
> the superblock's user namespace. If it can be translated to uid 0 according
> to that id mapping the caller can use the filesystem capabilities stored on
> disk. If we are accessing the inode that holds the filesystem capabilities
> through an idmapped mount we need to map the root uid according to the
> mount's user namespace.
> Afterwards the checks are identical to non-idmapped mounts. Reading
> filesystem caps from disk enforces that the root uid associated with the
> filesystem capability must have a mapping in the superblock's user
> namespace and that the caller is either in the same user namespace or is a
> descendant of the superblock's user namespace. For filesystems that are
> mountable inside user namespace the container can just mount the filesystem
> and won't usually need to idmap it. If it does create an idmapped mount it
> can mark it with a user namespace it has created and which is therefore a
> descendant of the s_user_ns. For filesystems that are not mountable inside
> user namespaces the descendant rule is trivially true because the s_user_ns
> will be the initial user namespace.
>
> If the initial user namespace is passed all operations are a nop so
> non-idmapped mounts will not see a change in behavior and will also not see
> any performance impact.
>
> Cc: Christoph Hellwig
> Cc: David Howells
> Cc: Al Viro
> Cc: linux-fsdevel@vger.kernel.org
> Signed-off-by: Christian Brauner

...

> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 8dba8f0983b5..ddb9213a3e81 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -1944,7 +1944,7 @@ static inline int audit_copy_fcaps(struct audit_names *name,
>         if (!dentry)
>                 return 0;
>
> -       rc = get_vfs_caps_from_disk(dentry, &caps);
> +       rc = get_vfs_caps_from_disk(&init_user_ns, dentry, &caps);
>         if (rc)
>                 return rc;
>
> @@ -2495,7 +2495,8 @@ int __audit_log_bprm_fcaps(struct linux_binprm *bprm,
>         ax->d.next = context->aux;
>         context->aux = (void *)ax;
>
> -       get_vfs_caps_from_disk(bprm->file->f_path.dentry, &vcaps);
> +       get_vfs_caps_from_disk(mnt_user_ns(bprm->file->f_path.mnt),
> +                              bprm->file->f_path.dentry, &vcaps);

As audit currently records information in the context of the
initial/host namespace I'm guessing we don't want the mnt_user_ns()
call above; it seems like &init_user_ns would be the right choice
(similar to audit_copy_fcaps()), yes?

>         ax->fcap.permitted = vcaps.permitted;
>         ax->fcap.inheritable = vcaps.inheritable;

--
paul moore
www.paul-moore.com