Date: Sun, 30 Sep 2018 16:02:10 +0200
User-Agent: K-9 Mail for Android
References: <20180929103453.12025-1-cyphar@cyphar.com>
Subject: Re: [PATCH 0/3] namei: implement various scoping AT_* flags
To: Alban Crequy, cyphar@cyphar.com
CC: jlayton@kernel.org, bfields@fieldses.org, Alexander Viro, arnd@arndb.de,
    shuah@kernel.org, dhowells@redhat.com, luto@kernel.org,
    "Eric W. Biederman", tycho@tycho.ws, LKML, linux-fsdevel,
    linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org, dev,
    Linux Containers, jsafrane@redhat.com, msau@google.com
From: Christian Brauner
Message-ID: <58BB23FF-E652-4C58-AEE4-4B5376D03BF0@brauner.io>
X-Mailing-List: linux-kernel@vger.kernel.org

On September 30, 2018 3:54:31 PM GMT+02:00, Alban Crequy wrote:
>On Sat, Sep 29, 2018 at 12:35 PM Aleksa Sarai wrote:
>>
>> The need for some sort of control over VFS's path resolution (to avoid
>> malicious paths resulting in inadvertent breakouts) has been a very
>> long-standing desire of many userspace applications. This patchset is a
>> revival of Al Viro's old AT_NO_JUMPS[1] patchset with a few additions.
>>
>> The most obvious change is that AT_NO_JUMPS has been split as discussed
>> in the original thread, along with a further split of AT_NO_PROCLINKS
>> which means that each individual property of AT_NO_JUMPS is now a
>> separate flag:
>>
>>   * Path-based escapes from the starting-point using "/" or ".." are
>>     blocked by AT_BENEATH.
>>   * Mountpoint crossings are blocked by AT_XDEV.
>>   * /proc/$pid/fd/$fd resolution is blocked by AT_NO_PROCLINKS (more
>>     correctly it actually blocks any user of nd_jump_link() because it
>>     allows out-of-VFS path resolution manipulation).
>>
>> AT_NO_JUMPS is now effectively (AT_BENEATH|AT_XDEV|AT_NO_PROCLINKS). At
>> Linus' suggestion in the original thread, I've also implemented
>> AT_NO_SYMLINKS which just denies _all_ symlink resolution (including
>> "proclink" resolution).
>
>It seems quite useful to me.
>
>> An additional improvement was made to AT_XDEV. The original AT_NO_JUMPS
>> path didn't consider "/tmp/.." as a mountpoint crossing -- this patch
>> blocks this as well (feel free to ask me to remove it if you feel this
>> is not sane).
>>
>> Currently I've only enabled these for openat(2) and the stat(2)
>> family. I would hope we could enable it for basically every *at(2) syscall --
>> but many of them appear to not have a @flags argument and thus we'll
>> need to add several new syscalls to do this. I'm more than happy to send
>> those patches, but I'd prefer to know that this preliminary work is
>> acceptable before doing a bunch of copy-paste to add new sets of *at(2)
>> syscalls.
>
>What do you think of an equivalent AT_NO_SYMLINKS flag for mount()?

We discussed that, but it would need to be part of the new mount API work
by David. The current mount API doesn't take AT_* flags since it doesn't
operate on fds, and we're (sort of) out of mount flags.

>
>I guess that would have made the fix for CVE-2017-1002101 in
>Kubernetes easier to write:
>https://kubernetes.io/blog/2018/04/04/fixing-subpath-volume-vulnerability/
>
>> One additional feature I've implemented is AT_THIS_ROOT (I imagine this
>> is probably going to be more contentious than the refresh of
>> AT_NO_JUMPS, so I've included it in a separate patch). The patch itself
>> describes my reasoning, but the shortened version of the premise is that
>> container runtimes need to have a way to resolve paths within a
>> potentially malicious rootfs. Container runtimes currently do this in
>> userspace[2] which has implicit race conditions that are not resolvable
>> in userspace (or use fork+exec+chroot and SCM_RIGHTS passing which is
>> inefficient). AT_THIS_ROOT allows for per-call chroot-like semantics for
>> path resolution, which would be invaluable for us -- and the
>> implementation is basically identical to AT_BENEATH (except that we
>> don't return errors when someone actually hits the root).
>>
>> I've added some selftests for this, but it's not clear to me whether
>> they should live here or in xfstests (as far as I can tell there are no
>> other VFS tests in selftests, while there are some tests that look like
>> generic VFS tests in
>> xfstests). If you'd prefer them to be included in
>> xfstests, let me know.
>>
>> [1]: https://lore.kernel.org/patchwork/patch/784221/
>> [2]: https://github.com/cyphar/filepath-securejoin
>>
>> Aleksa Sarai (3):
>>   namei: implement O_BENEATH-style AT_* flags
>>   namei: implement AT_THIS_ROOT chroot-like path resolution
>>   selftests: vfs: add AT_* path resolution tests
>>
>>  fs/fcntl.c                                   |   2 +-
>>  fs/namei.c                                   | 158 ++++++++++++------
>>  fs/open.c                                    |  10 ++
>>  fs/stat.c                                    |  15 +-
>>  include/linux/fcntl.h                        |   3 +-
>>  include/linux/namei.h                        |   8 +
>>  include/uapi/asm-generic/fcntl.h             |  20 +++
>>  include/uapi/linux/fcntl.h                   |  10 ++
>>  tools/testing/selftests/Makefile             |   1 +
>>  tools/testing/selftests/vfs/.gitignore       |   1 +
>>  tools/testing/selftests/vfs/Makefile         |  13 ++
>>  tools/testing/selftests/vfs/at_flags.h       |  40 +++++
>>  tools/testing/selftests/vfs/common.sh        |  37 ++++
>>  .../selftests/vfs/tests/0001_at_beneath.sh   |  72 ++++++++
>>  .../selftests/vfs/tests/0002_at_xdev.sh      |  54 ++++++
>>  .../vfs/tests/0003_at_no_proclinks.sh        |  50 ++++++
>>  .../vfs/tests/0004_at_no_symlinks.sh         |  49 ++++++
>>  .../selftests/vfs/tests/0005_at_this_root.sh |  66 ++++++++
>>  tools/testing/selftests/vfs/vfs_helper.c     | 154 +++++++++++++++++
>>  19 files changed, 707 insertions(+), 56 deletions(-)
>>  create mode 100644 tools/testing/selftests/vfs/.gitignore
>>  create mode 100644 tools/testing/selftests/vfs/Makefile
>>  create mode 100644 tools/testing/selftests/vfs/at_flags.h
>>  create mode 100644 tools/testing/selftests/vfs/common.sh
>>  create mode 100755 tools/testing/selftests/vfs/tests/0001_at_beneath.sh
>>  create mode 100755 tools/testing/selftests/vfs/tests/0002_at_xdev.sh
>>  create mode 100755 tools/testing/selftests/vfs/tests/0003_at_no_proclinks.sh
>>  create mode 100755 tools/testing/selftests/vfs/tests/0004_at_no_symlinks.sh
>>  create mode 100755 tools/testing/selftests/vfs/tests/0005_at_this_root.sh
>>  create mode 100644 tools/testing/selftests/vfs/vfs_helper.c
>>
>> --
>> 2.19.0
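
For readers following the thread, the difference between AT_BENEATH and
AT_THIS_ROOT as described in the cover letter can be modelled in a few lines
of userspace code. This is only a rough sketch and not the patch's
implementation: it is purely lexical, it ignores symlinks and mountpoints
(the cases that make the userspace approach racy), and the names
resolve(), this_root and EscapeError are hypothetical. It also assumes
that ".." is only an error under AT_BENEATH when it would step above the
starting point.

```python
class EscapeError(Exception):
    """Hypothetical error: the path would leave the starting directory."""


def resolve(path, this_root=False):
    """Lexically resolve `path` below a starting directory.

    this_root=False models the AT_BENEATH description: "/" and an
    escaping ".." are errors.
    this_root=True models the AT_THIS_ROOT description: the starting
    directory acts like a chroot, so "/" and ".." at the top simply
    stay at the top instead of failing.
    Symlinks and mountpoints are NOT handled; this is a model only.
    """
    if path.startswith("/") and not this_root:
        raise EscapeError("absolute path escapes the starting point")

    stack = []  # components below the starting directory
    for part in path.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if stack:
                stack.pop()
            elif not this_root:
                raise EscapeError("'..' escapes the starting point")
            # With this_root, ".." at the top is a no-op, as in a chroot.
            continue
        stack.append(part)
    return "/".join(stack) or "."


if __name__ == "__main__":
    print(resolve("a/b/../c"))
    print(resolve("a/../../etc/passwd", this_root=True))
    print(resolve("/etc/passwd", this_root=True))
```

A real resolver also has to handle symlinks that point outside the tree and
concurrent renames, which is the part the cover letter says cannot be done
race-free in userspace.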