Received: by 2002:a05:6a10:a0d1:0:0:0:0 with SMTP id j17csp327825pxa; Wed, 12 Aug 2020 03:07:53 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyy40y/EXasgO/IGM4xbd/x9PvNrg0OnTc1KQd0YRR55c88xNfvUE/jjp3kozQr9xamE7cw X-Received: by 2002:a50:af83:: with SMTP id h3mr12377019edd.139.1597226873430; Wed, 12 Aug 2020 03:07:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1597226873; cv=none; d=google.com; s=arc-20160816; b=SONdIIEPWHHFtUokFWTkncS4TbWSM3i9dgG8ug4KsME5PubNeGe3YUgLTYmluQwp97 O8mYO9evdLZorNDhQ2KI3gccdJJPfvYnPrIz8cdqjJvpJfuc0IwffGTBo3eSsQS1Rxi9 0pNB531iSnXbouY+k3odncqXV5xBGs149COY54pH/WwH9yoDidTZx09yaSq6BR0bI8ln eS1qhrpdz4GUzvNekcI8H8JfHHDPWOMLuRsjuGafT6B9VnRt0+/6nghu8isUWspRhUMh kp6sljiSinn8dlvSUf02Y0rY1WWQX6Ke9nUAoXYUrwE4uORXEDlxqAmFZEXozpoR4NmR 1oDw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=NXRy2yw29J7GGFZWkjCyWi1acy7uKE2xHcBDVhiHWHQ=; b=wMdSSfym/J99ig4iKE1fcwnMpuz7mDBdZlikMS3k6uLjdNA+MTwimNo2Gf1QYPoyNv EeK/Sgx/ug1G58sME7RqsE+t8zwGv68ticntNx8s77y8E9Zi1g9fPCdhoLVPsJBY66GA ZNzVJwZj71gDvO57/b0Co0gQocVf31nMtLg+IZgdo77eLkBy1LtK0pyeM8A95GfYJBRO VZP7AaP9kaMH/djHU1gYjOew2+M3VzqKN+IFotULhBlEG22CVYrS4u76UwJozzwkH6YI 8D+WFh03ay1jq1M8lCOpLIgNAdfxfrQ38eHwWkaGPlG9Une3ySRcKDoiRpnzyxFbbVxn nHfw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id e11si903658ejd.431.2020.08.12.03.07.29; Wed, 12 Aug 2020 03:07:53 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727780AbgHLKG5 (ORCPT + 99 others); Wed, 12 Aug 2020 06:06:57 -0400 Received: from foss.arm.com ([217.140.110.172]:43570 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726722AbgHLKG5 (ORCPT ); Wed, 12 Aug 2020 06:06:57 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AAB2AD6E; Wed, 12 Aug 2020 03:06:55 -0700 (PDT) Received: from C02TD0UTHF1T.local (unknown [10.57.41.8]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 521E23F22E; Wed, 12 Aug 2020 03:06:53 -0700 (PDT) Date: Wed, 12 Aug 2020 11:06:50 +0100 From: Mark Rutland To: "Madhavan T. Venkataraman" Cc: kernel-hardening@lists.openwall.com, linux-api@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-integrity@vger.kernel.org, linux-kernel@vger.kernel.org, linux-security-module@vger.kernel.org, oleg@redhat.com, x86@kernel.org Subject: Re: [PATCH v1 0/4] [RFC] Implement Trampoline File Descriptor Message-ID: <20200812100650.GB28154@C02TD0UTHF1T.local> References: <20200728131050.24443-1-madvenka@linux.microsoft.com> <20200731180955.GC67415@C02TD0UTHF1T.local> <6236adf7-4bed-534e-0956-fddab4fd96b6@linux.microsoft.com> <20200804143018.GB7440@C02TD0UTHF1T.local> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Aug 06, 2020 at 12:26:02PM -0500, Madhavan T. Venkataraman wrote: > Thanks for the lively discussion. I have tried to answer some of the > comments below. > > On 8/4/20 9:30 AM, Mark Rutland wrote: > > > >> So, the context is - if security settings in a system disallow a page to have > >> both write and execute permissions, how do you allow the execution of > >> genuine trampolines that are runtime generated and placed in a data > >> page or a stack page? > > There are options today, e.g. > > > > a) If the restriction is only per-alias, you can have distinct aliases > > where one is writable and another is executable, and you can make it > > hard to find the relationship between the two. > > > > b) If the restriction is only temporal, you can write instructions into > > an RW- buffer, transition the buffer to R--, verify the buffer > > contents, then transition it to --X. > > > > c) You can have two processes A and B where A generates instrucitons into > > a buffer that (only) B can execute (where B may be restricted from > > making syscalls like write, mprotect, etc). > > The general principle of the mitigation is W^X. I would argue that > the above options are violations of the W^X principle. If they are > allowed today, they must be fixed. And they will be. So, we cannot > rely on them. Hold on. Contemporary W^X means that a given virtual alias cannot be writeable and executeable simultaneously, permitting (a) and (b). If you read the references on the Wikipedia page for W^X you'll see the OpenBSD 3.3 release notes and related presentation make this clear, and further they expect (b) to occur with JITS flipping W/X with mprotect(). Please don't conflate your assumed stronger semantics with the general principle. It not matching you expectations does not necessarily mean that it is wrong. If you want a stronger W^X semantics, please refer to this specifically with a distinct name. > a) This requires a remap operation. Two mappings point to the same >      physical page. One mapping has W and the other one has X. This >      is a violation of W^X. > > b) This is again a violation. The kernel should refuse to give execute >      permission to a page that was writeable in the past and refuse to >      give write permission to a page that was executable in the past. > > c) This is just a variation of (a). As above, this is not true. If you have a rationale for why this is desirable or necessary, please justify that before using this as justification for additional features. > In general, the problem with user-level methods to map and execute > dynamic code is that the kernel cannot tell if a genuine application is > using them or an attacker is using them or piggy-backing on them. Yes, and as I pointed out the same is true for trampfd unless you can somehow authenticate the calls are legitimate (in both callsite and the set of arguments), and I don't see any reasonable way of doing that. If you relax your threat model to an attacker not being able to make arbitrary syscalls, then your suggestion that userspace can perorm chceks between syscalls may be sufficient, but as I pointed out that's equally true for a sealed memfd or similar. > Off the top of my head, I have tried to identify some examples > where we can have more trust on dynamic code and have the kernel > permit its execution. > > 1. If the kernel can do the job, then that is one safe way. Here, the kernel >     is the code. There is no code generation involved. This is what I >     have presented in the patch series as the first cut. This is sleight-of-hand; it doesn't matter where the logic is performed if the power is identical. Practically speaking this is equivalent to some dynamic code generation. I think that it's misleading to say that because the kernel emulates something it is safe when the provenance of the syscall arguments cannot be verified. [...] > Anyway, these are just examples. The principle is - if we can identify > dynamic code that has a certain measure of trust, can the kernel > permit their execution? My point generally is that the kernel cannot identify this, and if usrspace code is trusted to dynamically generate trampfd arguments it can equally be trusted to dyncamilly generate code. [...] > As I have mentioned above, I intend to have the kernel generate code > only if the code generation is simple enough. For more complicated cases, > I plan to use a user-level code generator that is for exclusive kernel use. > I have yet to work out the details on how this would work. Need time. This reads to me like trampfd is only dealing with a few special cases and we know that we need a more general solution. I hope I am mistaken, but I get the strong impression that you're trying to justify your existing solution rather than trying to understand the problem space. To be clear, my strong opinion is that we should not be trying to do this sort of emulation or code generation within the kernel. I do think it's worthwhile to look at mechanisms to make it harder to subvert dynamic userspace code generation, but I think the code generation itself needs to live in userspace (e.g. for ABI reasons I previously mentioned). Mark.