X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240131175027.3287009-1-jeffxu@chromium.org> <20240131193411.opisg5yoyxkwoyil@revolver>
In-Reply-To: <20240131193411.opisg5yoyxkwoyil@revolver>
From: Jeff Xu
Date: Wed, 31 Jan 2024 17:27:11 -0800
Subject: Re: [PATCH v8 0/4] Introduce mseal
To: "Liam R. Howlett", jeffxu@chromium.org, Jonathan Corbet,
 akpm@linux-foundation.org, keescook@chromium.org, jannh@google.com,
 sroettger@google.com, willy@infradead.org, gregkh@linuxfoundation.org,
 torvalds@linux-foundation.org, usama.anjum@collabora.com,
 rdunlap@infradead.org, jeffxu@google.com, jorgelo@chromium.org,
 groeck@chromium.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mm@kvack.org,
 pedro.falcato@gmail.com, dave.hansen@intel.com,
 linux-hardening@vger.kernel.org, deraadt@openbsd.org

On Wed, Jan 31, 2024 at 11:34 AM Liam R. Howlett wrote:
>
> Please add me to the Cc list of these patches.

Ok.

>
> * jeffxu@chromium.org [240131 12:50]:
> > From: Jeff Xu
> >
> > This patchset proposes a new mseal() syscall for the Linux kernel.
> >
> > In a nutshell, mseal() protects the VMAs of a given virtual memory
> > range against modifications, such as changes to their permission bits.
> >
> > Modern CPUs support memory permissions, such as the read/write (RW)
> > and no-execute (NX) bits. Linux has supported NX since the release of
> > kernel version 2.6.8 in August 2004 [1]. The memory permission feature
> > improves the security stance on memory corruption bugs, as an attacker
> > cannot simply write to arbitrary memory and point the code to it. The
> > memory must be marked with the X bit, or else an exception will occur.
> > Internally, the kernel maintains the memory permissions in a data
> > structure called VMA (vm_area_struct). mseal() additionally protects
> > the VMA itself is against modifications of the selected seal type.
> >
> > ...
>
> The v8 cut Jonathan's email discussion [1] off and instead of
> replying there, I'm going to add my question here.
>
> The best plan to ensure it is a general safety measure for all of linux
> is to work with the community before it lands upstream.
> It's much harder to change functionality provided to users after it is
> upstream. I'm happy to hear google is super excited about sharing this,
> but so far, the community isn't as excited.
>
> It seems Theo has a lot of experience trying to add a feature very close
> to what you are doing and has real data on how this went [2]. Can we
> see if there is a solution that is, at least, different enough from what
> he tried to do for a shot of success? Do we have anyone in the
> toolchain groups that sees this working well? If this means Stephen
> needs to do something, can we get that to happen please?
>

Regarding Theo's input from OpenBSD's perspective, IIUC: as of today,
mseal (Linux) and mimmutable (OpenBSD) have the same scope in terms of
which operations they seal, considering the progress made on both sides
since the beginning of the RFC:
- mseal (Linux): dropped the "multiple-bit" approach.
- mimmutable (OpenBSD): dropped "downgradable"; added
  madvise(MADV_DONTNEED).

The difference is in mmap(), i.e.
- mseal (Linux): support of PROT_SEAL in mmap().
- mseal (Linux): use of MAP_SEALABLE in mmap().

I considered Theo's input from OpenBSD's perspective regarding this
difference, and I wasn't convinced that Linux should remove these. In
my view, these are two different kernels' code, and the difference in
Linux was not added without reason (for MAP_SEALABLE, there is a note
in the documentation section with details).

I would love to hear more from Linux developers on this.

> I mean, you specifically state that this is a 'very specific
> requirement' in your cover letter. Does this mean even other browsers
> have no use for it?
>

No, I don’t mean “other browsers have no use for it”.

The specific requirement from Chrome refers to "The lifetime of those
mappings are not tied to the lifetime of the process, which is not the
case of libc", as stated in the cover letter.
This addition to the cover letter was made in V3, so some additional
context may help answer the question.

This patch series began with multiple-bit approaches (v1, v2, v3). The
rationale was that I was uncertain whether Chrome's specific needs were
common enough for other use cases, and I was unable to make that
decision myself without input from the community. To accommodate this,
multiple bits were selected initially for their adaptability.

Since V1, after hearing from the community, Chrome has changed its
design (no longer relying on separating out mprotect), and Linus
acknowledged the defect of madvise(MADV_DONTNEED) [1]. With those
inputs, today mseal() has a simple design that:
- meets Chrome's specific needs.
- meets libc's needs.
- keeps Chrome's specific needs from interfering with libc's.

[1] https://lore.kernel.org/all/CAHk-=wiVhHmnXviy1xqStLRozC4ziSugTk=1JOc8ORWd2_0h7g@mail.gmail.com/

> I am very concerned this feature will land and have to be maintained by
> the core mm people for the one user it was specifically targeting.
>

See above. This feature is not specifically targeting Chrome.

> Can we also get some benchmarking on the impact of this feature? I
> believe my answer in v7 removed the worst offender, but since there is
> no benchmarking we really are guessing (educated or not, hard data would
> help). We still have an extra loop in madvise, mprotect_pkey, mremap_to
> (and mreamp syscall?).
>

Yes. There is an extra loop in mmap(MAP_FIXED), munmap(),
madvise(MADV_DONTNEED), and mremap() to enumerate the VMAs for the
given address range. I suspect the impact would be low, but having some
hard data would be good. I will see what I can find to assist the perf
testing. If you have a specific test suite in mind, I can also try it.

> You also did not clean up the loop you copied from mlock, which I
> pointed out [3]. Stating that your copy/paste is easier to review is
> not sufficient to keep unneeded assignments around.
>

OK.

> [1]. https://lore.kernel.org/linux-mm/87a5ong41h.fsf@meer.lwn.net/
> [2]. https://lore.kernel.org/linux-mm/86181.1705962897@cvs.openbsd.org/
> [3]. https://lore.kernel.org/linux-mm/20240124200628.ti327diy7arb7byb@revolver/
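
To make the "extra loop" discussed above concrete, here is a rough userspace sketch in Python of a pass over the VMAs in an address range that refuses the operation if any of them is sealed. It is only a toy model and not the kernel code: the names Vma, can_modify and the toy munmap are made up for illustration, and checking the whole range before changing anything is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Vma:
    """Toy stand-in for a vm_area_struct: [start, end) plus a sealed flag."""
    start: int
    end: int
    sealed: bool = False


def can_modify(vmas, start, length):
    """Return False if any VMA overlapping [start, start + length) is sealed.

    This models the extra loop: one pass over the VMAs in the range
    before the real operation is allowed to touch anything.
    """
    end = start + length
    for vma in vmas:
        if vma.start < end and start < vma.end and vma.sealed:
            return False
    return True


def munmap(vmas, start, length):
    """Toy munmap: fail with no side effects if the range hits a sealed VMA.

    To stay short, only VMAs lying entirely inside the range are removed;
    a real munmap would also split partially covered VMAs.
    """
    if not can_modify(vmas, start, length):
        raise PermissionError("range overlaps a sealed mapping")
    end = start + length
    return [v for v in vmas if not (start <= v.start and v.end <= end)]


# Three adjacent hypothetical mappings; the middle one is sealed.
vmas = [
    Vma(0x1000, 0x2000),
    Vma(0x2000, 0x3000, sealed=True),
    Vma(0x3000, 0x4000),
]
```

In this model the cost of the check grows with the number of VMAs in the requested range, which is the kind of overhead a benchmark would need to measure.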