References: <20181009065300.11053-1-cyphar@cyphar.com>
 <20181009065300.11053-3-cyphar@cyphar.com>
 <20181010070747.byi2itbi4j42gynq@ryuk>
In-Reply-To: <20181010070747.byi2itbi4j42gynq@ryuk>
From: Andy Lutomirski
Date: Thu, 11 Oct 2018 18:12:01 -0700
Subject: Re: [PATCH v2 1/3] namei: implement O_BENEATH-style AT_* flags
To: Aleksa Sarai
Cc: Andrew Lutomirski, Al Viro, "Eric W. Biederman", Christian Brauner,
 Jeff Layton, "J. Bruce Fields", Arnd Bergmann, David Howells, Jann Horn,
 Tycho Andersen, David Drysdale, dev@opencontainers.org, Linux Containers,
 Linux FS Devel, LKML, linux-arch, Linux API
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 10, 2018 at 12:08 AM Aleksa Sarai wrote:
>
> On 2018-10-09, Andy Lutomirski wrote:
> > On Mon, Oct 8, 2018 at 11:53 PM Aleksa Sarai wrote:
> > > * AT_NO_PROCLINK: Disallows ->get_link "symlink" jumping. This is a very
> > >   specific restriction, and it exists because /proc/$pid/fd/...
> > >   "symlinks" allow for access outside nd->root and pose risk to
> > >   container runtimes that don't want to be tricked into accessing a host
> > >   path (but do want to allow no-funny-business symlink resolution).
> >
> > Can you elaborate on the use case?
> >
> > If I'm set up a container namespace and walk it for real (through the
> > outside /proc/PID/root or otherwise starting from an fd that points
> > into that namespace), and I walk through that namespace's /proc, I'm
> > going to see the same thing that the processes in the namespace would
> > see. So what's the issue?
> >
> > Similarly, if I somehow manage to walk into the outside /proc, then
> > I've pretty much lost regardless of the links.
>
> Well, there's a couple of reasons:
>
> * The original AT_NO_JUMPS patchset similarly disabled "proclinks" but
>   it was sort of all contained within AT_NO_JUMPS. In order to have a
>   precise 1:1 feature mapping we need this in *some* form (and in v1 the
>   only way to get it was to add a separate flag). According to the
>   original O_BENEATH changelog, both you and Al pushed for this to be
>   part of O_BENEATH. :P

:)

Now that you mention it, I *think* my reasoning involved a rather
different use case: sandboxing. If a task is Capsicum-ified or
seccomp()ed such that it can *only* use O_BENEATH or AT_BENEATH, this
restriction considerably strengthens the resulting security.
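
To make the "proclink" concern concrete, here is a rough userspace sketch
(Python, Linux only, assuming /proc is mounted). It does not use any of the
flags from this patchset. The helper names open_via_proclink() and demo()
are made up for illustration. It only shows that a /proc/self/fd/N
"symlink" jumps straight to the open file rather than re-walking the path
text that readlink() reports, which is why a lookup that is otherwise
confined beneath a directory can still be led somewhere else through one
of these links.

```python
import os
import tempfile


def open_via_proclink(fd):
    """Reopen an existing fd through its /proc/self/fd magic link.

    Hypothetical helper for this sketch; it assumes /proc is mounted.
    """
    return os.open(f"/proc/self/fd/{fd}", os.O_RDONLY)


def demo():
    """Show that the magic link does not depend on its path text."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "outside.txt")
        with open(target, "w") as f:
            f.write("host data")

        fd = os.open(target, os.O_RDONLY)
        # Remove the name: ordinary path resolution can no longer reach it.
        os.unlink(target)

        # The link text is informational only; the kernel does not
        # re-resolve it when the link is followed.
        link_text = os.readlink(f"/proc/self/fd/{fd}")

        # Following the magic link still lands on the open file.
        fd2 = open_via_proclink(fd)
        try:
            data = os.read(fd2, 64)
        finally:
            os.close(fd2)
            os.close(fd)
        return link_text, data


if __name__ == "__main__":
    text, data = demo()
    print("link text:", text)
    print("data read:", data)
```

The same jump works for any fd the task holds, including one that refers
to something outside the directory it is supposed to be confined to, so a
sandbox that only permits beneath-style lookups would presumably want
these links refused as well.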