Date: Wed, 14 Feb 2024 15:55:24 +0100
From: Petr Tesařík
To: Greg Kroah-Hartman
Cc: Andrew Morton, Petr Tesarik, Jonathan Corbet, David Kaplan,
 Larry Dewey, Elena Reshetova, Carlos Bilbao,
 "Masami Hiramatsu (Google)", Randy Dunlap, Petr Mladek,
 "Paul E. McKenney", Eric DeVolder, Marc Aurèle La France,
 "Gustavo A. R. Silva", Nhat Pham, "Christian Brauner (Microsoft)",
 Douglas Anderson, Luis Chamberlain, Guenter Roeck, Mike Christie,
 Kent Overstreet, Maninder Singh, "open list:DOCUMENTATION", open list,
 Roberto Sassu, Petr Tesarik
Subject: Re: [PATCH v1 5/5] sbm: SandBox Mode documentation
Message-ID: <20240214155524.719ffb15@meshulam.tesarici.cz>
In-Reply-To: <2024021425-audition-expand-2901@gregkh>
References: <20240214113035.2117-1-petrtesarik@huaweicloud.com>
 <20240214113035.2117-6-petrtesarik@huaweicloud.com>
 <20240214053053.982b48d993ae99dad1d59020@linux-foundation.org>
 <2024021425-audition-expand-2901@gregkh>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 14 Feb 2024 15:01:25 +0100
Greg Kroah-Hartman wrote:

> On Wed, Feb 14, 2024 at 05:30:53AM -0800, Andrew Morton wrote:
> > On Wed, 14 Feb 2024 12:30:35 +0100 Petr Tesarik wrote:
> >
> > > +Although data structures are not serialized and deserialized between kernel
> > > +mode and sandbox mode, all directly and indirectly referenced data structures
> > > +must be explicitly mapped into the sandbox, which requires some manual effort.
> >
> > Maybe I'm missing something here, but...
> >
> > The requirement that the sandboxed function only ever touch two linear
> > blocks of memory (yes?) seems a tremendous limitation. I mean, how can
> > the sandboxed function call kmalloc()? How can it call any useful
> > kernel functions? They'll all touch memory which lies outside the
> > sandbox areas?
> >
> > Perhaps a simple but real-world example would help clarify.
>
> I agree, this looks like an "interesting" framework, but we don't add
> code to the kernel without a real, in-kernel user for it.
>
> Without such a thing, we can't even consider it for inclusion as we
> don't know how it will actually work and how any subsystem would use it.
>
> Petr, do you have an user for this today?

Hi Greg & Andrew,

Your observations are correct. In this form, the framework is quite
limited, and exactly this objection was expected. You have even spotted
one of the first enhancements I tested on top of this framework
(dynamic memory allocation).

The intended use case is code that processes untrusted data that is not
properly sanitized, but where performance is not critical. Some
examples include decompressing the initramfs, loading a kernel module,
or decoding a boot logo; I think I've noticed a vulnerability in
another project recently... ;-)

Of course, even decompression needs dynamic memory. My plan is to
extend the mechanism. Right now I'm mapping all of kernel text into the
sandbox. Later, I'd like to decompose the text section too. The pages
which contain sandboxed code should be mapped, but the rest of the
kernel should not. If the sandbox tries to call kmalloc(), vmalloc(),
or schedule(), the attempt will generate a page fault. Sandbox page
faults are already intercepted, so handle_sbm_call() can decide whether
the call should be allowed. If the sandbox policy says ALLOW, the page
fault handler will perform the API call on behalf of the sandboxed code
and return the results, possibly with some post-call action, e.g.
mapping some more pages into the address space.

The fact that all communication with the rest of the kernel happens
through CPU exceptions is the reason this mechanism is unsuitable for
performance-critical applications.

OK, so why didn't I send the whole thing?

Decomposition of the kernel requires many more changes, e.g. in linker
scripts. Some of them depend on this patch series. Before I go and
clean up my code into something that can be submitted, I want to get
feedback from guys like you, to know if the whole idea would even be
considered, aka "Fail Fast".

Petr T
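
To illustrate the interception scheme described above (the sandbox
faults when it reaches for kmalloc(), vmalloc() or schedule(), and the
fault handler consults a policy), here is a minimal user-space model.
It is only a sketch and not code from the patch series: every demo_*
name is hypothetical, call targets are identified by symbol name
rather than by faulting address, and the deny-by-default rule is an
assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical verdicts; these names are illustrative only. */
enum demo_verdict { DEMO_DENY = 0, DEMO_ALLOW = 1 };

/* One policy rule: a kernel API the sandboxed code may try to reach. */
struct demo_rule {
	const char *symbol;
	enum demo_verdict verdict;
};

/* Example policy (assumed): allocation is proxied, scheduling is refused. */
static const struct demo_rule demo_policy[] = {
	{ "kmalloc",  DEMO_ALLOW },
	{ "vmalloc",  DEMO_ALLOW },
	{ "schedule", DEMO_DENY  },
};

/* Return the verdict for a faulting call target; unknown targets are denied. */
static enum demo_verdict demo_policy_lookup(const char *symbol)
{
	size_t i;

	assert(symbol != NULL);
	for (i = 0; i < sizeof(demo_policy) / sizeof(demo_policy[0]); i++) {
		if (strcmp(demo_policy[i].symbol, symbol) == 0)
			return demo_policy[i].verdict;
	}
	return DEMO_DENY;
}

/*
 * Model of the fault path. A real handler would perform the API call on
 * behalf of the sandbox at this point; this model only accounts for the
 * post-call action of mapping one more page. Returns the new number of
 * mapped pages, or -1 if the policy refuses the call.
 */
static long demo_handle_fault(const char *symbol, long mapped_pages)
{
	if (demo_policy_lookup(symbol) != DEMO_ALLOW)
		return -1;
	return mapped_pages + 1;
}
```

In this model, demo_handle_fault("vmalloc", 4) yields 5 mapped pages,
while a fault on "schedule" or on any symbol missing from the table is
refused with -1. Denying unknown targets keeps the policy an explicit
allow-list, which seems the safer default when the caller is untrusted.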