Date: Sun, 12 May 2019 01:49:23 +1000
From: Aleksa Sarai
To: Christian Brauner
Cc: Jann Horn, Andy Lutomirski, Aleksa Sarai, Al Viro, Jeff Layton,
    "J. Bruce Fields", Arnd Bergmann, David Howells, Eric Biederman,
    Andrew Morton, Alexei Starovoitov, Kees Cook, Tycho Andersen,
    David Drysdale, Chanho Min, Oleg Nesterov, Linus Torvalds,
    Linux Containers, linux-fsdevel, Linux API, kernel list, linux-arch
Subject: Re: [PATCH v6 5/6] binfmt_*: scope path resolution of interpreters
Message-ID: <20190511154923.z5woxv4dqperuqty@mikami>
References: <20190506165439.9155-1-cyphar@cyphar.com>
    <20190506165439.9155-6-cyphar@cyphar.com>
    <20190506191735.nmzf7kwfh7b6e2tf@yavin>
    <20190510204141.GB253532@google.com>
    <20190510225527.GA59914@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2019-05-11, Christian Brauner wrote:
> > In my opinion, the problems here are:
> >
> >  - Apparently some people run untrusted containers without user
> >    namespaces. It would be really nice if people could not do that.
> >    (Probably the biggest problem here.)
>
> I know I sound like a broken record since I've been going on about this
> forever together with a lot of other people but honestly,
> the fact that people are running untrusted workloads in privileged containers
> is the real issue here.

I completely agree. It's a shit-show, and it's caused by bad defaults in
Docker and (now) podman. To be fair, they both now support rootless
containers but the default is still privileged containers. They do
support user namespaces (though it should be noted that LXD's support is
much nicer from a security standpoint) but unless it's the default the
support is almost pointless.
In the case of Docker it can lead to some usability issues when you
enable it (which I believe is the main justification for it not being
the default).

> Aleksa is a good friend of mine and we have discussed this a lot so I hope
> he doesn't hate me for saying this again: it is crazy that there are container
> runtimes out there that promise (or at least do not state the opposite)
> containers without user namespaces or containers with user namespaces
> that allow to map the host root id to anything can be safe. They cannot.

Yeah, the fact that we (runc) don't scream from the rooftops that this
setup is insecure is definitely a problem. I have mentioned this
whenever I've had a chance, but the fact that the most popular runtimes
(which use runc) don't use user namespaces compounds the issue. I'm
willing to bet that >90% of users of runc-based runtimes don't use user
namespaces at all, and this is all down to bad defaults.

There are also some other misfeatures we have in runc that we're
basically forced to support because some users use them, and we can't
really break entire projects (even though it's the projects' fault they
have an insecure setup).

> It seems to me to be heading in the wrong direction to keep up the
> illusion that with enough effort we can make this all nice and safe.
> Yes, the userspace memfd hack we came up with is as ugly as a security
> patch can be but if you make promises you can't keep you better be
> prepared to pay the price when things start to fall apart.
> So if this part of the patch is just needed to handle this do we really
> want to do all that tricky work or is there more to gain from this that
> makes it worth it.

I dropped this patch in v7, I don't think it's required for the
overarching feature. Looking back on it, it doesn't make much sense
given the context that privileged containers are unsafe in the first
place.

I do think that being able to block introspection might be a useful
hardening feature though.
During attachment it would be nice to be sure that nothing will be able
to touch the attaching process's /proc/$pid -- even itself.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH