Date: Wed, 4 Nov 2020 15:20:32 +0000
From: Mark Rutland
To: Topi Miettinen
Cc: Florian Weimer, Will Deacon, Mark Brown, Szabolcs Nagy, libc-alpha@sourceware.org, Jeremy Linton, Catalin Marinas, Kees Cook, Salvatore Mesoraca, Lennart Poettering, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kernel-hardening@lists.openwall.com, linux-hardening@vger.kernel.org
Subject: Re: [PATCH 0/4] aarch64: avoid mprotect(PROT_BTI|PROT_EXEC) [BZ #26831]
Message-ID: <20201104152032.GC7577@C02TD0UTHF1T.local>
References: <20201103173438.GD5545@sirena.org.uk> <20201104092012.GA6439@willie-the-truck> <87h7q54ghy.fsf@oldenburg2.str.redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Nov 04, 2020 at 11:55:57AM +0200, Topi Miettinen wrote:
> On 4.11.2020 11.29, Florian Weimer wrote:
> > * Will Deacon:
> >
> > > Is there real value in this seccomp filter if it only looks at mprotect(),
> > > or was it just implemented because it's easy to do and sounds like a good
> > > idea?
> >
> > It seems bogus to me. Everyone will just create alias mappings instead,
> > just like they did for the similar SELinux feature. See “Example code
> > to avoid execmem violations” in:
> >
> >
>
> Also note "But this is very dangerous: programs should never use memory
> regions which are writable and executable at the same time. Assuming that it
> is really necessary to generate executable code while the program runs the
> method employed should be reconsidered."

Sure, and to be clear we're not trying to violate the "at the same time"
property. We do not want to permit simultaneous PROT_WRITE and PROT_EXEC
at any instant in time. What we're asking is to not block changing
permissions to PROT_EXEC in the absence of PROT_WRITE.

I think that the goal of preventing WRITE -> EXEC transitions for some
memory is sane, but I think the existing kernel primitives available to
systemd don't allow us to do that in a robust way because we don't have
all the relevant state tracked and accessible, and the existing approach
gets in the way of doing the right thing for other mitigations.

Consequently I think it would be better going forward to add a more
robust (kernel) mechanism for enforcement that can distinguish
WRITE->EXEC from EXEC->EXEC+BTI, and e.g. can be used to forbid aliasing
mappings with differing W/X permissions. Then userspace could eventually
transition over to that and get /stronger/ protection while permitting
the BTI case we'd like to work now.
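
To make the WRITE->EXEC versus EXEC->EXEC+BTI distinction concrete, here
is a minimal userspace sketch of the kind of per-mapping decision such a
mechanism might make. Everything in it is an assumption for illustration:
struct map_state, map_state_init() and prot_change_allowed() are
hypothetical names, not an existing kernel interface, and the rules are
only one possible reading of the goals described above. The PROT_BTI
fallback of 0x10 is the arm64 UAPI value, defined only so the sketch
builds on other architectures.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/* PROT_BTI is arm64-only; 0x10 is the arm64 UAPI value, used here as a
 * fallback so the sketch builds on other architectures too. */
#ifndef PROT_BTI
#define PROT_BTI 0x10
#endif

/* Hypothetical per-mapping state; not an existing kernel structure. */
struct map_state {
	int prot;		/* current protection bits */
	bool ever_writable;	/* has this mapping ever had PROT_WRITE? */
};

static struct map_state map_state_init(int prot)
{
	struct map_state m = { prot, (prot & PROT_WRITE) != 0 };
	return m;
}

/* Returns true and updates *m if the change is permitted under the
 * assumed policy; returns false and leaves *m untouched otherwise. */
static bool prot_change_allowed(struct map_state *m, int new_prot)
{
	bool old_x = (m->prot & PROT_EXEC) != 0;
	bool old_bti = (m->prot & PROT_BTI) != 0;
	bool new_w = (new_prot & PROT_WRITE) != 0;
	bool new_x = (new_prot & PROT_EXEC) != 0;
	bool new_bti = (new_prot & PROT_BTI) != 0;

	/* W and X at the same instant is never permitted. */
	if (new_w && new_x)
		return false;

	/* WRITE -> EXEC: memory that was ever writable may not become
	 * executable, even after an intermediate read-only step. */
	if (new_x && m->ever_writable)
		return false;

	/* BTI may not be dropped from a mapping that stays executable. */
	if (old_x && old_bti && new_x && !new_bti)
		return false;

	m->prot = new_prot;
	if (new_w)
		m->ever_writable = true;
	return true;
}

/* Usage example covering the cases discussed in the thread. */
static void check_examples(void)
{
	struct map_state text = map_state_init(PROT_READ | PROT_EXEC);
	struct map_state data = map_state_init(PROT_READ | PROT_WRITE);

	/* EXEC -> EXEC+BTI is allowed. */
	assert(prot_change_allowed(&text, PROT_READ | PROT_EXEC | PROT_BTI));
	/* Dropping BTI again is refused. */
	assert(!prot_change_allowed(&text, PROT_READ | PROT_EXEC));
	/* WRITE -> EXEC is refused, directly or via a read-only step. */
	assert(!prot_change_allowed(&data, PROT_READ | PROT_EXEC));
	assert(prot_change_allowed(&data, PROT_READ));
	assert(!prot_change_allowed(&data, PROT_READ | PROT_EXEC));
}
```

The ever_writable flag is the piece of state a stateless filter cannot
see: it is what lets the policy accept EXEC->EXEC+BTI while still
refusing a mapping that was written and is later made executable.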
> If a service legitimately needs executable and writable mappings (due to
> JIT, trampolines etc), it's easy to disable the filter whenever really
> needed with "MemoryDenyWriteExecute=no" (which is the default) in case of
> systemd or a TE rule like "allow type_t self:process { execmem };" for
> SELinux. But this shouldn't be the default case, since there are many
> services which don't need W&X.
>
> I'd also question what is the value of BTI if it can be easily circumvented
> by removing PROT_BTI with mprotect()?

I agree that turning BTI off is a concern, and to that end I'd like to
add an enforcement mechanism whereby we could prevent that (ideally the
same mechanism by which we could prevent WRITE -> EXEC transitions).

But, as with all things, it's a matter of degree. MDWE and BTI are both
hurdles to an adversary, but neither is absolute and there are
approaches to bypass either. By the time someone's issuing mprotect()
with an arbitrary VA and/or prot, they are liable to have been able to
do the same with mmap() and circumvent MDWE.

I'd really like to not have BTI silently disabled in order to work with
MDWE, because the risk is that it gets silently disabled elsewhere. The
risk of changing the kernel to enable BTI for a binary is not well
known, since we don't control other people's libraries that might end up
not being compatible with that.

The risk of disabling a portion of the MDWE protections seems to be the
least of the options we have available, as unfortunate as it seems,
and I think we can come up with a better MDWE approach going forward.

Thanks,
Mark.
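
For readers following the mmap() versus mprotect() point, the sketch
below models a stateless, argument-only filter of the sort discussed in
this thread. It is a simplified model under stated assumptions, not the
actual systemd seccomp filter: the two rules (refuse mprotect() whenever
PROT_EXEC is requested, refuse mmap() only when PROT_WRITE and PROT_EXEC
are requested together) and the names mdwe_model_allows_mprotect() and
mdwe_model_allows_mmap() are assumptions made for illustration. The
PROT_BTI fallback of 0x10 is the arm64 UAPI value.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/* PROT_BTI is arm64-only; 0x10 is the arm64 UAPI value, used here as a
 * fallback so the sketch builds on other architectures too. */
#ifndef PROT_BTI
#define PROT_BTI 0x10
#endif

/* Assumed rule: mmap() is refused only when W and X are requested
 * together. */
static bool mdwe_model_allows_mmap(int prot)
{
	return !((prot & PROT_WRITE) && (prot & PROT_EXEC));
}

/* Assumed rule: mprotect() is refused whenever PROT_EXEC is requested.
 * The filter sees only the new prot argument, not the mapping's current
 * or past permissions. */
static bool mdwe_model_allows_mprotect(int prot)
{
	return !(prot & PROT_EXEC);
}

/* Usage example: the BTI case this thread is about. */
static void check_examples(void)
{
	int rx_bti = PROT_READ | PROT_EXEC | PROT_BTI;

	/* Adding BTI to already-executable memory via mprotect() is
	 * refused, although nothing is or was writable. */
	assert(!mdwe_model_allows_mprotect(rx_bti));
	/* The same permissions requested via mmap() are accepted. */
	assert(mdwe_model_allows_mmap(rx_bti));
	/* Simultaneous W and X is refused on both paths. */
	assert(!mdwe_model_allows_mmap(PROT_READ | PROT_WRITE | PROT_EXEC));
	assert(!mdwe_model_allows_mprotect(PROT_READ | PROT_WRITE | PROT_EXEC));
}
```

Under these assumed rules the filter cannot tell EXEC->EXEC+BTI apart
from WRITE->EXEC, because both arrive as an mprotect() call carrying
PROT_EXEC.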