From: Gabriel Krisman Bertazi
To: "Robert O'Callahan"
Cc: Andy Lutomirski, Linux-MM, open list, kernel@collabora.com,
    Thomas Gleixner, Kees Cook, Will Drewry, "H. Peter Anvin", Paul Gofman
Subject: Re: [PATCH RFC] seccomp: Implement syscall isolation based on memory areas
Organization: Collabora
References: <20200530055953.817666-1-krisman@collabora.com> <85367hkl06.fsf@collabora.com>
Date: Thu, 25 Jun 2020 19:48:58 -0400
In-Reply-To: (Robert O'Callahan's message of "Fri, 26 Jun 2020 11:14:56 +1200")
Message-ID: <877dvuemfp.fsf@collabora.com>
X-Mailing-List: linux-kernel@vger.kernel.org

"Robert O'Callahan" writes:

> rr (https://rr-project.org, https://arxiv.org/abs/1705.05937) grapples
> with a similar problem.
> We need to intercept commonly-executed system
> calls and wrap them with our own processing, with minimal overhead. I
> think our basic approach might work for Wine without kernel changes.
>
> We use SECCOMP_SET_MODE_FILTER with a simple filter that returns
> SECCOMP_RET_TRAP on all syscalls except for those called from a single
> specific trampoline page (which get SECCOMP_RET_ALLOW). rr ptraces its
> children. So, when user-space makes a syscall, the seccomp filter
> triggers a ptrace trap. The ptracer looks at the code around the
> syscall and if it matches certain common patterns, the ptracer patches
> the code with a jump to a stub that does extra work and issues a real
> syscall via the trampoline. Thus, each library syscall instruction is
> slow the first time and fast every subsequent time. "Weird" syscalls
> that the ptracer chooses not to patch do incur the context-switch
> penalty every time so their overhead does increase a lot ... but it
> sounds like that might be OK in Wine's case?
>
> A more efficient variant of this approach which would work in some
> cases (but maybe not Wine?) would be to avoid using a ptracer and give
> the process a SIGSYS handler which does the patching.

We couldn't patch Windows code because of the aforementioned DRM and
anti-cheat mechanisms, but I suppose this limitation doesn't apply to
Wine/native code, and if this assumption is correct, this approach could
work.

One complexity might be the consistent model for the syscall live
patching. I don't know how much of the problem is diminished from the
original userspace live-patching problem, but I believe at least part of
it applies. And fencing every thread to patch would kill performance.
Also, we cannot just patch everything at the beginning. How does rr
handle that?

Another problem is that we will want to support i386 and other
architectures.
For int 0x80, it is trickier to encode a branch to another
region, given the limited instruction space, and the patching might not
be possible in hot paths. I did port libsyscall-intercept to x86-32
once and I could correctly patch glibc, but it's not guaranteed that an
updated libc or something else won't break it.

I'm not sure the benefit of not needing enhanced kernel support
justifies the complexity and performance cost required to make this work
reliably, in particular since the semantics of the kernel implementation
that we are discussing don't seem overly intrusive and might have other
applications, like the generic filter Andy mentioned.

--
Gabriel Krisman Bertazi
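
For reference, the trampoline-page filter described in the quoted text
could look roughly like the sketch below. This is not rr's actual
filter. The helper names (build_trampoline_filter, run_filter) and
TRAMPOLINE_FILTER_LEN are made up for illustration, and the sketch
assumes a little-endian 64-bit target, a 4 KiB page-aligned trampoline
address, and omits the architecture check a real filter would need. The
small interpreter is only there so the filter logic can be exercised
without installing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

#define TRAMPOLINE_FILTER_LEN 7 /* hypothetical name, not a kernel constant */

/* Hypothetical helper: fill `out` with a classic-BPF program that returns
 * SECCOMP_RET_ALLOW when the reported instruction pointer lies in the
 * 4 KiB page at `page_addr`, and SECCOMP_RET_TRAP otherwise.
 * Assumption: little-endian, so the low 32 bits of the IP come first. */
static void build_trampoline_filter(uint64_t page_addr,
                                    struct sock_filter out[TRAMPOLINE_FILTER_LEN])
{
    const uint32_t ip_lo = offsetof(struct seccomp_data, instruction_pointer);
    const uint32_t ip_hi = ip_lo + 4;

    assert((page_addr & 0xfffu) == 0); /* must be page aligned */

    const struct sock_filter prog[TRAMPOLINE_FILTER_LEN] = {
        /* 0: A = high 32 bits of the instruction pointer */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, ip_hi),
        /* 1: mismatch -> skip 4 instructions, landing on the TRAP at 6 */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (uint32_t)(page_addr >> 32), 0, 4),
        /* 2: A = low 32 bits of the instruction pointer */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, ip_lo),
        /* 3: drop the offset within the page */
        BPF_STMT(BPF_ALU | BPF_AND | BPF_K, ~0xfffu),
        /* 4: mismatch -> skip 1 instruction, landing on the TRAP at 6 */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, (uint32_t)page_addr, 0, 1),
        /* 5: syscall issued from the trampoline page */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
        /* 6: everything else raises SIGSYS (or stops for a tracer) */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP),
    };
    memcpy(out, prog, sizeof(prog));
}

/* Minimal interpreter covering only the opcodes used above, so the
 * filter can be checked in user space. Not a general BPF evaluator. */
static uint32_t run_filter(const struct sock_filter *prog, size_t len,
                           const struct seccomp_data *data)
{
    uint32_t acc = 0;

    for (size_t pc = 0; pc < len; pc++) {
        const struct sock_filter *insn = &prog[pc];

        switch (insn->code) {
        case BPF_LD | BPF_W | BPF_ABS:
            memcpy(&acc, (const char *)data + insn->k, sizeof(acc));
            break;
        case BPF_ALU | BPF_AND | BPF_K:
            acc &= insn->k;
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:
            /* jump offsets are relative to the next instruction */
            pc += (acc == insn->k) ? insn->jt : insn->jf;
            break;
        case BPF_RET | BPF_K:
            return insn->k;
        }
    }
    return SECCOMP_RET_KILL; /* fell off the end; should not happen here */
}
```

To install such a program one would wrap the array in a struct
sock_fprog and pass it to seccomp(SECCOMP_SET_MODE_FILTER, ...) after
setting PR_SET_NO_NEW_PRIVS. Note that the instruction pointer seccomp
reports is the address after the syscall instruction, so a syscall
placed at the very end of the page would not match this check.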