Date: Wed, 14 Feb 2024 20:14:15 +0100
From: Petr Tesařík
To: "H. Peter Anvin"
Cc: Dave Hansen, Petr Tesarik, Jonathan Corbet, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", Andy Lutomirski, Oleg Nesterov, Peter Zijlstra, Xin Li, Arnd Bergmann, Andrew Morton, Rick Edgecombe, Kees Cook, "Masami Hiramatsu (Google)", Pengfei Xu, Josh Poimboeuf, Ze Gao, "Kirill A. Shutemov", Kai Huang, David Woodhouse, Brian Gerst, Jason Gunthorpe, Joerg Roedel, "Mike Rapoport (IBM)", Tina Zhang, Jacob Pan, "open list:DOCUMENTATION", open list, Roberto Sassu, Petr Tesarik
Subject: Re: [PATCH v1 0/8] x86_64 SandBox Mode arch hooks
Message-ID: <20240214201415.3dc7d69f@meshulam.tesarici.cz>
In-Reply-To: <9EF956AB-DF48-4DAA-AB42-0FBC513ECA22@zytor.com>
References: <20240214113516.2307-1-petrtesarik@huaweicloud.com> <34B19756-91D3-4DA1-BE76-BD3122C16E95@zytor.com> <20240214174143.74a4f10c@meshulam.tesarici.cz> <9EF956AB-DF48-4DAA-AB42-0FBC513ECA22@zytor.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 14 Feb 2024 09:29:06 -0800 "H.
Peter Anvin" wrote:

> On February 14, 2024 8:41:43 AM PST, "Petr Tesařík" wrote:
> >On Wed, 14 Feb 2024 07:28:35 -0800
> >"H. Peter Anvin" wrote:
> >
> >> On February 14, 2024 6:52:53 AM PST, Dave Hansen wrote:
> >> >On 2/14/24 03:35, Petr Tesarik wrote:
> >> >> This patch series implements x86_64 arch hooks for the generic SandBox
> >> >> Mode infrastructure.
> >> >
> >> >I think I'm missing a bit of context here. What does one _do_ with
> >> >SandBox Mode? Why is it useful?
> >>
> >> Seriously. On the surface it looks like a really bad idea – basically an ad hoc, *more* privileged version of user space.
> >
> >Hi hpa,
> >
> >I agree that it kind of tries to do "user mode without user mode".
> >There are some differences from actual user mode:
> >
> >First, from a process management POV, sandbox mode appears to be
> >running in kernel mode. So, there is no way to use ptrace(2), send
> >malicious signals or otherwise interact with the sandbox. In fact,
> >the process can have three independent contexts: user mode, kernel mode
> >and sandbox mode.
> >
> >Second, a sandbox can run unmodified kernel code and interact directly
> >with other parts of the kernel. It's not really possible with this
> >initial patch series, but the plan is that sandbox mode can share locks
> >with the kernel.
> >
> >Third, sandbox code can be trusted for operations like parsing keys for
> >the trusted keychain if the kernel is locked down, i.e. when even a
> >process with UID 0 is not on the same trust level as kernel mode.
> >
> >HTH
> >Petr T
> >
>
> This, to me, seems like "all the downsides of a microkernel without the upsides." Furthermore, it breaks security-hardening features like LASS and (to a lesser degree) SMAP. Not to mention dropping global pages?

I must be missing something... But I am always open to learning
something new.

I don't see how it breaks SMAP. Sandbox mode runs in its own address
space, which does not contain any user-mode pages. While running in
sandbox mode, user pages belong to the sandboxed code, and kernel pages
are used to enter/exit kernel mode. The bottom half of the PGD is empty,
and all user page translations are removed from the TLB.

For a similar reason, I don't see right now how it breaks linear address
space separation. Even if it did, I believe I can take care of it in the
entry/exit path. Anyway, which branch contains the LASS patches now, so
I can test?

As for dropping global pages, that's only part of the story. Indeed,
patch 6/8 of the series sets CR4.PGE to zero to have a known-good
working state, but that code is removed again by patch 8/8. I wanted to
implement lazy TLB flushing separately, so that it can be easily
reverted if it is suspected of causing an issue.

Plus, each sandbox can use PCID to reduce TLB flushing even more. I
haven't done that, because it would be a waste of time if the whole
concept is scratched.

I believe that only those global pages which are actually accessed by
the sandbox need to be flushed. Yes, some parts of the necessary logic
are missing from the current patch series. I can add them in a v2 series
if you wish.

> All in all, I cannot see this as anything other than an enormous step in the wrong direction, and it isn't even in the sense of "it is harmless if no one uses it" – you are introducing architectural changes that are most definitely *very* harmful both to maintainers and users.

I agree that it adds some burden. After all, that's why the ultimate
decision is up to you, the maintainers. In my defense, I hope you have
noticed that if CONFIG_SANDBOX_MODE is not set:

1. literally nothing changes in entry_64.
2. sandbox_mode() always evaluates to false, so the added conditionals in
   fault.c and traps.c are never executed.
3. top_of_instr_stack() always returns current_top_of_stack(), which is
   equivalent to the code it replaces, namely
   this_cpu_read(pcpu_hot.top_of_stack).

So, all the interesting stuff is under arch/x86/kernel/sbm/. Shall I add
a corresponding entry with my name to MAINTAINERS?

> To me, this feels like paravirtualization all over again. 20 years later we still have not been able to undo all the damage that did.

OK, I can follow you here. Indeed, there is some similarity with Xen PV
(running kernel code with CPL 3), but I don't think there's more to it
than this.

Petr T
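
For illustration only, points 2 and 3 of the CONFIG_SANDBOX_MODE list above could look roughly like the userspace mock below. This is not code from the patch series. Only the names sandbox_mode(), top_of_instr_stack() and current_top_of_stack() come from the thread; mock_pcpu_top_of_stack, handle_fault_sketch() and sbm_stub_selfcheck() are hypothetical, and the plain variable merely stands in for this_cpu_read(pcpu_hot.top_of_stack).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU value the kernel reads with
 * this_cpu_read(pcpu_hot.top_of_stack); a plain variable in this mock. */
static unsigned long mock_pcpu_top_of_stack = 0x8000UL;

/* Mock of current_top_of_stack(): returns the stand-in value above. */
static inline unsigned long current_top_of_stack(void)
{
	return mock_pcpu_top_of_stack;
}

#ifdef CONFIG_SANDBOX_MODE
/* Real implementations would live under arch/x86/kernel/sbm/ (not shown). */
bool sandbox_mode(void);
unsigned long top_of_instr_stack(void);
#else
/* Constant false, so a compiler can drop any "if (sandbox_mode())" branch. */
static inline bool sandbox_mode(void)
{
	return false;
}

/* Yields the same value as the expression this helper replaces. */
static inline unsigned long top_of_instr_stack(void)
{
	return current_top_of_stack();
}
#endif

/* Assumed caller shape: a fault handler that special-cases sandbox mode. */
static inline int handle_fault_sketch(void)
{
	if (sandbox_mode())
		return 1;	/* sandbox-specific path, unreachable with the stub */
	return 0;		/* ordinary kernel path */
}

/* Checks that the stubs behave as the list claims. */
static inline void sbm_stub_selfcheck(void)
{
	assert(!sandbox_mode());
	assert(top_of_instr_stack() == mock_pcpu_top_of_stack);
	assert(handle_fault_sketch() == 0);
}
```

Because the stubs are static inline and constant, the added conditionals should fold away at compile time when the option is off, which is the sense in which the disabled configuration is meant to cost nothing.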