Subject: Re: [PATCH 0/4 POC] Allow executing code and syscalls in another address space
From: Benjamin Berg
To: Johannes Berg, Anton Ivanov, Andrei Vagin,
    linux-kernel@vger.kernel.org, linux-api@vger.kernel.org
Cc: linux-um@lists.infradead.org, criu@openvz.org, avagin@google.com,
    Andrew Morton, Andy Lutomirski, Christian Brauner,
    Dmitry Safonov <0x7f454c46@gmail.com>, Ingo Molnar, Jeff Dike,
    Mike Rapoport, Michael Kerrisk, Oleg Nesterov, Peter Zijlstra,
    Richard Weinberger, Thomas Gleixner
Date: Wed, 14 Apr 2021 11:24:19 +0200
In-Reply-To: <9f8280540bbc6f3c857ac5749eeafcd145577da3.camel@sipsolutions.net>
References: <20210414055217.543246-1-avagin@gmail.com>
    <78cdee11-1923-595f-90d2-e236efbafa6a@cambridgegreys.com>
    <9f8280540bbc6f3c857ac5749eeafcd145577da3.camel@sipsolutions.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2021-04-14 at 09:34 +0200, Johannes Berg wrote:
> On Wed, 2021-04-14 at 08:22 +0100, Anton Ivanov wrote:
> > On 14/04/2021 06:52, Andrei Vagin wrote:
> > > We already have process_vm_readv and process_vm_writev to read and
> > > write to a process memory faster than we can do this with ptrace.
> > > And now it is time for process_vm_exec that allows executing code
> > > in an address space of another process. We can do this with ptrace
> > > but it is much slower.
> > >
> > > = Use-cases =
> > >
> > > Here are two known use-cases. The first one is “application kernel”
> > > sandboxes like User-mode Linux and gVisor. In this case, we have a
> > > process that runs the sandbox kernel and a set of stub processes
> > > that are used to manage guest address spaces. Guest code is
> > > executed in the context of stub processes but all system calls are
> > > intercepted and handled in the sandbox kernel. Right now, these
> > > sort of sandboxes use PTRACE_SYSEMU to trap system calls, but the
> > > process_vm_exec can significantly speed them up.
> >
> > Certainly interesting, but will require um to rework most of its
> > memory management and we will most likely need extra mm support to
> > make use of it in UML. We are not likely to get away just with one
> > syscall there.
>
> Might help the seccomp mode though:
>
> https://patchwork.ozlabs.org/project/linux-um/list/?series=231980

Hmm, to me it sounds like it replaces both ptrace and seccomp mode
while completely avoiding the scheduling overhead that these techniques
have. I think everything UML needs is covered:

 * The new API can do syscalls in the target memory space (so we can
   modify the address space)
 * The new API can run code until the next syscall happens (or a signal
   happens, which means SIGALRM for scheduling works)
 * Single-step tracing should work by setting the trap flag in EFLAGS

I think the memory management itself stays fundamentally the same. We
just do the initial clone() using CLONE_STOPPED. We don't need any stub
code/data, and we have everything we need to modify the address space
and run the userspace process.

Benjamin
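
To illustrate the single-step point above, here is a minimal sketch, assuming an x86-64 guest whose saved register state includes an EFLAGS value. The guest_regs structure and both helper names are hypothetical and exist only for this example; they are not part of the proposed process_vm_exec interface or of UML. The only hard fact used is that the trap flag (TF) is bit 8 of EFLAGS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Trap flag: bit 8 of EFLAGS/RFLAGS on x86. While it is set, the CPU
 * raises a debug exception after executing one instruction. */
#define X86_EFLAGS_TF (UINT64_C(1) << 8)

/* Hypothetical snapshot of the guest registers we care about here.
 * This is an illustration only, not a kernel or UML structure. */
struct guest_regs {
	uint64_t rip;
	uint64_t eflags;
};

/* Turn single-stepping on or off for the next run of the guest by
 * setting or clearing TF in the saved flags. */
static void guest_set_single_step(struct guest_regs *regs, bool enable)
{
	assert(regs != NULL);
	if (enable)
		regs->eflags |= X86_EFLAGS_TF;
	else
		regs->eflags &= ~X86_EFLAGS_TF;
}

/* Report whether the saved flags currently request single-stepping. */
static bool guest_is_single_stepping(const struct guest_regs *regs)
{
	assert(regs != NULL);
	return (regs->eflags & X86_EFLAGS_TF) != 0;
}
```

The idea would be to call guest_set_single_step() on the saved state before handing control to the guest and to clear the flag again once the resulting trap has been handled. How that state is actually passed to the kernel depends on the final form of the new API.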