From: "Theo de Raadt"
To: "Liam R. Howlett", Jeff Xu, akpm@linux-foundation.org,
    keescook@chromium.org, jannh@google.com, sroettger@google.com,
    willy@infradead.org, gregkh@linuxfoundation.org,
    torvalds@linux-foundation.org, usama.anjum@collabora.com,
    rdunlap@infradead.org, jeffxu@google.com, jorgelo@chromium.org,
    groeck@chromium.org, linux-kernel@vger.kernel.org,
    linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
    pedro.falcato@gmail.com, dave.hansen@intel.com,
    linux-hardening@vger.kernel.org
Subject: Re: [PATCH v7 0/4] Introduce mseal()
In-reply-to: <20240123173320.2xl3wygzbxnrei2c@revolver>
References: <20240122152905.2220849-1-jeffxu@chromium.org>
    <726.1705938579@cvs.openbsd.org> <86181.1705962897@cvs.openbsd.org>
    <20240123173320.2xl3wygzbxnrei2c@revolver>
Date: Tue, 23 Jan 2024 11:58:41 -0700
Message-ID: <85359.1706036321@cvs.openbsd.org>

Liam R. Howlett wrote:

> * Theo de Raadt [240122 17:35]:
> > Jeff Xu wrote:
> >
> > > On Mon, Jan 22, 2024 at 7:49 AM Theo de Raadt wrote:
> > > >
> > > > Regarding these pieces
> > > >
> > > > > The PROT_SEAL bit in prot field of mmap(). When present, it marks
> > > > > the map sealed since creation.
> > > >
> > > > OpenBSD won't be doing this.
> > > > I had PROT_IMMUTABLE as a draft. In my
> > > > research I found basically zero circumstances when you userland does
> > > > that. The most common circumstance is you create a RW mapping, fill it,
> > > > and then change to a more restrictive mapping, and lock it.
> > > >
> > > > There are a few regions in the addressspace that can be locked while RW.
> > > > For instance, the stack. But the kernel does that, not userland. I
> > > > found regions where the kernel wants to do this to the address space,
> > > > but there is no need to export useless functionality to userland.
> > > >
> > > I have a feeling that most apps that need to use mmap() in their code
> > > are likely using RW mappings. Adding sealing to mmap() could stop
> > > those mappings from being executable. Of course, those apps would
> > > need to change their code. We can't do it for them.
> >
> > I don't have a feeling about it.
> >
> > I spent a year engineering a complete system which exercises the maximum
> > amount of memory you can lock.
> >
> > I saw nothing like what you are describing. I had PROT_IMMUTABLE in my
> > drafts, and saw it turning into a dangerous anti-pattern.
> >
> > > Also, I believe adding this to mmap() has no downsides, only
> > > performance gain, as Pedro Falcato pointed out in [1].
> > >
> > > [1] https://lore.kernel.org/lkml/CAKbZUD2A+=bp_sd+Q0Yif7NJqMu8p__eb4yguq0agEcmLH8SDQ@mail.gmail.com/
> >
> > Are you joking? You don't have any code doing that today. More feelings?
>
> The "no downside" is to combining two calls together; mmap() & mseal(),
> at least that is how I read the linked discussion.
>
> The common case (since there are no users today) of just calling
> mmap()/munmap() will have the downside.
>
> There will be a performance impact once you have can_modify_mm() doing
> more than just returning true.
> Certainly, the impact will be larger
> in munmap where multiple VMAs may need to be checked (assuming that's
> the plan?).
>
> This will require a new and earlier walk of the vma tree while holding
> the mmap_lock. Since you are checking (potentially multiple) VMAs for
> something, I don't think there is a way around holding the lock.
>
> I'm not saying the cost will be large, but it will be a positive
> non-zero number.

For future glibc changes, I predict you will have zero cases where you
can call mmap+immutable or mprotect+immutable. I say so because I ended
up having none. You always have to fill the memory. (At first glance
you might think it works for a new DSO's BSS, but RELRO overlaps it, and
since the RELRO mprotect happens quite late, the permission locking is
quite delayed relative to the allocation).

I think chrome also won't lock memory at allocation. I suspect the
generic allocator is quite separate from the code using the allocation,
which knows which objects can have their permissions locked and which
objects can't.

In OpenBSD, the only cases where we could set immutable at the same time
as creating the mapping were in execve, for a new process's stack
regions, and that is kernel code, not the userland exposed system call
APIs.

This change could skip adding PROT_MSEAL today, and add it later when
there are facts showing the need.

It's the same with MAP_MSEALABLE. I don't get it. So now there are 3
memory types:
- cannot be sealed, ever
- not yet sealed
- sealed

What purpose does the first type serve? Please explain the use case.

Today, processes have control over their entire address space.

What is the purpose of "permissions cannot be locked"? Please supply
an example. If I am wrong, I'd like to know where I went wrong.
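
A minimal sketch of the ordering described above (create a RW mapping,
fill it, change to a more restrictive mapping, then lock it), to show why
the lock cannot be applied at mmap() time. It assumes Linux and a kernel
that exposes mseal() as a raw system call under the name SYS_mseal; where
that is not available the sketch simply skips the lock step. The helper
names try_seal() and map_fill_lock() are hypothetical and exist only for
this illustration. On OpenBSD the corresponding call would be
mimmutable(2).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to lock the permissions of [addr, addr+len).
 * Assumption: the kernel provides mseal() and the headers define SYS_mseal.
 * Returns 0 if the range was sealed, 1 if it was not (call unavailable
 * or refused).
 */
static int try_seal(void *addr, size_t len)
{
#ifdef SYS_mseal
    /* mseal(addr, len, flags); the flags argument is passed as 0 here. */
    if (syscall(SYS_mseal, addr, len, 0UL) == 0)
        return 0;
#else
    (void)addr;
    (void)len;
#endif
    return 1;
}

/*
 * Hypothetical helper: map RW, fill, drop to read-only, then lock.
 * On success stores the mapping in *out and returns 0 (sealed) or
 * 1 (read-only but not sealed). Returns -1 on failure.
 */
static int map_fill_lock(const char *data, size_t len, void **out)
{
    assert(data != NULL && len > 0 && out != NULL);

    long pg = sysconf(_SC_PAGESIZE);
    if (pg <= 0)
        return -1;
    size_t page = (size_t)pg;
    size_t maplen = (len + page - 1) / page * page; /* round up to pages */

    /* Step 1: the mapping has to start out writable ... */
    void *p = mmap(NULL, maplen, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Step 2: ... because it has to be filled first. */
    memcpy(p, data, len);

    /* Step 3: only now can the permissions be tightened. */
    if (mprotect(p, maplen, PROT_READ) != 0) {
        munmap(p, maplen);
        return -1;
    }

    /* Step 4: lock the final permissions, if the kernel allows it. */
    *out = p;
    return try_seal(p, maplen);
}
```

A caller would use it as map_fill_lock(buf, buflen, &p) and treat both 0
and 1 as success. Once a range is sealed, later munmap() or mprotect()
calls on it are refused, so the sketch never unmaps a successful mapping.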