Date: Mon, 7 Sep 2020 16:25:10 +0200
From: Christian Brauner
To: Andy Lutomirski
Cc: Gabriel Krisman Bertazi, luto@kernel.org, tglx@linutronix.de, keescook@chromium.org, x86@kernel.org, linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, willy@infradead.org, linux-kselftest@vger.kernel.org, shuah@kernel.org, kernel@collabora.com
Subject: Re: [PATCH v6 6/9] kernel: entry: Support Syscall User Dispatch for common syscall entry
Message-ID: <20200907142510.klojh2urwyui23ox@wittgenstein>
References: <20200907101522.zo6qzgp4qfzkz7cs@wittgenstein> <0639209E-B6C6-4F86-84F4-04B91E1CC8AA@amacapital.net>
In-Reply-To: <0639209E-B6C6-4F86-84F4-04B91E1CC8AA@amacapital.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon,
Sep 07, 2020 at 07:15:52AM -0700, Andy Lutomirski wrote:
>
>
> > On Sep 7, 2020, at 3:15 AM, Christian Brauner wrote:
> >
> > On Fri, Sep 04, 2020 at 04:31:44PM -0400, Gabriel Krisman Bertazi wrote:
> >> Syscall User Dispatch (SUD) must take precedence over seccomp, since the
> >> use case is emulation (it can be invoked with a different ABI) such that
> >> seccomp filtering by syscall number doesn't make sense in the first
> >> place. In addition, either the syscall is dispatched back to userspace,
> >> in which case there is no resource for seccomp to protect, or the
> >
> > Tbh, I'm torn here. I'm not a super clever attacker but it feels to me
> > that this is still at least a clever way to circumvent a seccomp
> > sandbox.
> > If I'd be confined by a seccomp profile that would cause me to be
> > SIGKILLed when I try do open() I could prctl() myself to do user
> > dispatch to prevent that from happening, no?
> >
>
> Not really, I think. The idea is that you didn’t actually do open().
> You did a SYSCALL instruction which meant something else, and the
> syscall dispatch correctly prevented the kernel from misinterpreting
> it as open().

Right, for the case where you're e.g. emulating Windows syscalls that's
true. I was thinking of the case where you're running natively on Linux:
couldn't I first load a seccomp profile "kill me if someone does an
open()", then exec() the target binary, with that binary set up to do
prctl(USER_DISPATCH) first thing? I guess it's ok because, as far as I
had time to read it, this is an all-or-nothing mechanism, i.e. _all_
system calls are re-routed, in contrast to e.g. seccomp where I could do
this per-syscall. So for user-dispatch it wouldn't make sense to use it
on Linux per se. Still makes me a little uneasy. :)

Christian
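
For illustration only, the ordering discussed above can be sketched as a toy model in Python. This is not kernel code: `Task`, `syscall_entry` and the result strings are hypothetical names invented for the sketch. It assumes that user dispatch is all-or-nothing, as described in the reply, and that a syscall which is not dispatched goes through seccomp as usual.

```python
from dataclasses import dataclass, field


# Toy model only: these names are hypothetical and do not exist in the kernel.
@dataclass
class Task:
    seccomp_kill: set = field(default_factory=set)  # syscalls the filter kills on
    user_dispatch: bool = False                     # enabled via prctl() in the scenario


def syscall_entry(task: Task, name: str) -> str:
    """Return what happens when `task` issues the syscall `name`."""
    # User dispatch is checked first and, in this model, reroutes everything.
    if task.user_dispatch:
        return "dispatched to userspace"
    # Only syscalls the kernel would actually execute reach seccomp.
    if name in task.seccomp_kill:
        return "SIGKILL"
    return "executed by kernel"


sandboxed = Task(seccomp_kill={"open"})
print(syscall_entry(sandboxed, "open"))    # SIGKILL

sandboxed.user_dispatch = True
print(syscall_entry(sandboxed, "open"))    # dispatched to userspace
print(syscall_entry(sandboxed, "getpid"))  # dispatched to userspace
```

Once dispatch is on, nothing in this model reaches the "executed by kernel" branch, so the kernel never performs the open() that the filter was meant to guard.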