From: Kees Cook
To: YiFei Zhu
Cc: Kees Cook, Jann Horn, Christian Brauner, Tycho Andersen,
    Andy Lutomirski, Will Drewry, Andrea Arcangeli, Giuseppe Scrivano,
    Tobin Feldman-Fitzthum, Dimitrios Skarlatos, Valentin Rothberg,
    Hubertus Franke, Jack Chen, Josep Torrellas, Tianyin Xu,
    bpf@vger.kernel.org, containers@lists.linux-foundation.org,
    linux-api@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/6] seccomp: Introduce SECCOMP_PIN_ARCHITECTURE
Date: Wed, 23 Sep 2020 16:29:18 -0700
Message-Id: <20200923232923.3142503-2-keescook@chromium.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20200923232923.3142503-1-keescook@chromium.org>
References: <20200923232923.3142503-1-keescook@chromium.org>
X-Mailing-List: linux-kernel@vger.kernel.org

For systems that provide multiple syscall maps based on audit
architectures (e.g. AUDIT_ARCH_X86_64 and AUDIT_ARCH_I386 via
CONFIG_COMPAT) or via syscall masks (e.g.
x86_x32), allow a fast way to pin the process to a specific syscall
table, instead of needing to generate all filters with an architecture
check as the first filter action.

This creates the internal representation that seccomp itself can use
(which is separate from the filters, which need to stay runtime
agnostic). This additionally paves the way for constant-action bitmaps.

Signed-off-by: Kees Cook
---
 include/linux/seccomp.h                       |  9 +++
 include/uapi/linux/seccomp.h                  |  1 +
 kernel/seccomp.c                              | 79 ++++++++++++++++++-
 tools/testing/selftests/seccomp/seccomp_bpf.c | 33 ++++++++
 4 files changed, 120 insertions(+), 2 deletions(-)

diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index 02aef2844c38..0be20bc81ea9 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -20,12 +20,18 @@
 #include
 #include
 
+#define SECCOMP_ARCH_IS_NATIVE		1
+#define SECCOMP_ARCH_IS_COMPAT		2
+#define SECCOMP_ARCH_IS_MULTIPLEX	3
+#define SECCOMP_ARCH_IS_UNKNOWN		0xff
+
 struct seccomp_filter;
 /**
  * struct seccomp - the state of a seccomp'ed process
  *
  * @mode:  indicates one of the valid values above for controlled
  *         system calls available to a process.
+ * @arch: seccomp's internal architecture identifier (not seccomp_data->arch)
  * @filter: must always point to a valid seccomp-filter or NULL as it is
  *          accessed without locking during system call entry.
  *
@@ -34,6 +40,9 @@ struct seccomp_filter;
  */
 struct seccomp {
 	int mode;
+#ifdef SECCOMP_ARCH
+	u8 arch;
+#endif
 	atomic_t filter_count;
 	struct seccomp_filter *filter;
 };
diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
index 6ba18b82a02e..f4d134ebfa7e 100644
--- a/include/uapi/linux/seccomp.h
+++ b/include/uapi/linux/seccomp.h
@@ -16,6 +16,7 @@
 #define SECCOMP_SET_MODE_FILTER		1
 #define SECCOMP_GET_ACTION_AVAIL	2
 #define SECCOMP_GET_NOTIF_SIZES		3
+#define SECCOMP_PIN_ARCHITECTURE	4
 
 /* Valid flags for SECCOMP_SET_MODE_FILTER */
 #define SECCOMP_FILTER_FLAG_TSYNC		(1UL << 0)
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index ae6b40cc39f4..0a3ff8eb8aea 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -298,6 +298,47 @@ static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
 	return 0;
 }
 
+#ifdef SECCOMP_ARCH
+static inline u8 seccomp_get_arch(u32 syscall_arch, u32 syscall_nr)
+{
+	u8 seccomp_arch;
+
+	switch (syscall_arch) {
+	case SECCOMP_ARCH:
+		seccomp_arch = SECCOMP_ARCH_IS_NATIVE;
+		break;
+#ifdef CONFIG_COMPAT
+	case SECCOMP_ARCH_COMPAT:
+		seccomp_arch = SECCOMP_ARCH_IS_COMPAT;
+		break;
+#endif
+	default:
+		seccomp_arch = SECCOMP_ARCH_IS_UNKNOWN;
+	}
+
+#ifdef SECCOMP_MULTIPLEXED_SYSCALL_TABLE_ARCH
+	if (syscall_arch == SECCOMP_MULTIPLEXED_SYSCALL_TABLE_ARCH) {
+		seccomp_arch |= (syscall_nr & SECCOMP_MULTIPLEXED_SYSCALL_TABLE_MASK) >>
+				SECCOMP_MULTIPLEXED_SYSCALL_TABLE_SHIFT;
+	}
+#endif
+
+	return seccomp_arch;
+}
+#endif
+
+static inline bool seccomp_arch_mismatch(struct seccomp *seccomp,
+					 const struct seccomp_data *sd)
+{
+#ifdef SECCOMP_ARCH
+	/* Block mismatched architectures. */
+	if (seccomp->arch && seccomp->arch != seccomp_get_arch(sd->arch, sd->nr))
+		return true;
+#endif
+
+	return false;
+}
+
 /**
  * seccomp_run_filters - evaluates all seccomp filters against @sd
  * @sd: optional seccomp data to be passed to filters
@@ -312,9 +353,14 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd,
 			       struct seccomp_filter **match)
 {
 	u32 ret = SECCOMP_RET_ALLOW;
+	struct seccomp_filter *f;
+	struct seccomp *seccomp = &current->seccomp;
+
+	if (seccomp_arch_mismatch(seccomp, sd))
+		return SECCOMP_RET_KILL_PROCESS;
+
 	/* Make sure cross-thread synced filter points somewhere sane. */
-	struct seccomp_filter *f =
-			READ_ONCE(current->seccomp.filter);
+	f = READ_ONCE(seccomp->filter);
 
 	/* Ensure unexpected behavior doesn't result in failing open. */
 	if (WARN_ON(f == NULL))
@@ -522,6 +568,11 @@ static inline void seccomp_sync_threads(unsigned long flags)
 		if (task_no_new_privs(caller))
 			task_set_no_new_privs(thread);
 
+#ifdef SECCOMP_ARCH
+		/* Copy any pinned architecture. */
+		thread->seccomp.arch = caller->seccomp.arch;
+#endif
+
 		/*
 		 * Opt the other thread into seccomp if needed.
 		 * As threads are considered to be trust-realm
@@ -1652,6 +1703,23 @@ static long seccomp_get_notif_sizes(void __user *usizes)
 	return 0;
 }
 
+static long seccomp_pin_architecture(void)
+{
+#ifdef SECCOMP_ARCH
+	struct task_struct *task = current;
+
+	u8 arch = seccomp_get_arch(syscall_get_arch(task),
+				   syscall_get_nr(task, task_pt_regs(task)));
+
+	/* How did you even get here? */
+	if (task->seccomp.arch && task->seccomp.arch != arch)
+		return -EBUSY;
+
+	task->seccomp.arch = arch;
+#endif
+	return 0;
+}
+
 /* Common entry point for both prctl and syscall. */
 static long do_seccomp(unsigned int op, unsigned int flags,
 		       void __user *uargs)
@@ -1673,6 +1741,13 @@ static long do_seccomp(unsigned int op, unsigned int flags,
 			return -EINVAL;
 
 		return seccomp_get_notif_sizes(uargs);
+	case SECCOMP_PIN_ARCHITECTURE:
+		if (flags != 0)
+			return -EINVAL;
+		if (uargs != NULL)
+			return -EINVAL;
+
+		return seccomp_pin_architecture();
 	default:
 		return -EINVAL;
 	}
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 9c398768553b..d90551e0385e 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -157,6 +157,10 @@ struct seccomp_data {
 #define SECCOMP_GET_NOTIF_SIZES 3
 #endif
 
+#ifndef SECCOMP_PIN_ARCHITECTURE
+#define SECCOMP_PIN_ARCHITECTURE 4
+#endif
+
 #ifndef SECCOMP_FILTER_FLAG_TSYNC
 #define SECCOMP_FILTER_FLAG_TSYNC (1UL << 0)
 #endif
@@ -2221,6 +2225,35 @@ TEST_F_SIGNAL(TRACE_syscall, kill_after, SIGSYS)
 	EXPECT_NE(self->mypid, syscall(__NR_getpid));
 }
 
+TEST(seccomp_architecture_pin)
+{
+	long ret;
+
+	ret = seccomp(SECCOMP_PIN_ARCHITECTURE, 0, NULL);
+	ASSERT_EQ(0, ret) {
+		TH_LOG("Kernel does not support SECCOMP_PIN_ARCHITECTURE!");
+	}
+
+	/* Make sure unexpected arguments are rejected. */
+	ret = seccomp(SECCOMP_PIN_ARCHITECTURE, 1, NULL);
+	ASSERT_EQ(-1, ret);
+	EXPECT_EQ(EINVAL, errno) {
+		TH_LOG("Did not reject SECCOMP_PIN_ARCHITECTURE with flags!");
+	}
+
+	ret = seccomp(SECCOMP_PIN_ARCHITECTURE, 0, &ret);
+	ASSERT_EQ(-1, ret);
+	EXPECT_EQ(EINVAL, errno) {
+		TH_LOG("Did not reject SECCOMP_PIN_ARCHITECTURE with address!");
+	}
+
+	ret = seccomp(SECCOMP_PIN_ARCHITECTURE, 1, &ret);
+	ASSERT_EQ(-1, ret);
+	EXPECT_EQ(EINVAL, errno) {
+		TH_LOG("Did not reject SECCOMP_PIN_ARCHITECTURE with flags and address!");
+	}
+}
+
 TEST(seccomp_syscall)
 {
 	struct sock_filter filter[] = {
-- 
2.25.1
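
For readers following the pin/mismatch logic, here is a rough userspace
model of what seccomp_get_arch(), seccomp_arch_mismatch() and
seccomp_pin_architecture() do in the patch above. It is only a sketch,
not kernel code. The SECCOMP_ARCH_IS_* values come from the patch;
everything prefixed MODEL_ or model_ is a hypothetical name introduced
here. The x86_64 audit values and the x32 bit are used purely as an
example, and the shift of 29 is an assumption, since this patch does not
itself define SECCOMP_ARCH, SECCOMP_ARCH_COMPAT, or the
SECCOMP_MULTIPLEXED_SYSCALL_TABLE_* macros. The model returns -1 where
the kernel function returns -EBUSY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Internal identifiers, copied from the patch. */
#define SECCOMP_ARCH_IS_NATIVE		1
#define SECCOMP_ARCH_IS_COMPAT		2
#define SECCOMP_ARCH_IS_MULTIPLEX	3
#define SECCOMP_ARCH_IS_UNKNOWN		0xff

/* ASSUMPTION: x86_64-flavoured values, chosen for illustration only. */
#define MODEL_ARCH_NATIVE	0xC000003Eu	/* AUDIT_ARCH_X86_64 */
#define MODEL_ARCH_COMPAT	0x40000003u	/* AUDIT_ARCH_I386 */
#define MODEL_MUX_MASK		0x40000000u	/* x32 syscall bit */
#define MODEL_MUX_SHIFT		29		/* hypothetical: maps the bit to 2 */

/* Userspace model of seccomp_get_arch(). */
static uint8_t model_get_arch(uint32_t syscall_arch, uint32_t syscall_nr)
{
	uint8_t arch;

	switch (syscall_arch) {
	case MODEL_ARCH_NATIVE:
		arch = SECCOMP_ARCH_IS_NATIVE;
		break;
	case MODEL_ARCH_COMPAT:
		arch = SECCOMP_ARCH_IS_COMPAT;
		break;
	default:
		arch = SECCOMP_ARCH_IS_UNKNOWN;
	}

	/* Multiplexed table: NATIVE (1) | 2 yields MULTIPLEX (3). */
	if (syscall_arch == MODEL_ARCH_NATIVE)
		arch |= (syscall_nr & MODEL_MUX_MASK) >> MODEL_MUX_SHIFT;

	return arch;
}

/* Model of seccomp_arch_mismatch(): pinned == 0 means "not pinned". */
static bool model_arch_mismatch(uint8_t pinned, uint32_t syscall_arch,
				uint32_t syscall_nr)
{
	return pinned && pinned != model_get_arch(syscall_arch, syscall_nr);
}

/* Model of seccomp_pin_architecture(): 0 on success, -1 if already
 * pinned to a different table. */
static int model_pin(uint8_t *pinned, uint32_t syscall_arch,
		     uint32_t syscall_nr)
{
	uint8_t arch = model_get_arch(syscall_arch, syscall_nr);

	if (*pinned && *pinned != arch)
		return -1;
	*pinned = arch;
	return 0;
}

/* Pin from a native syscall, then check calls from other tables. */
static void model_demo(void)
{
	uint8_t pinned = 0;

	assert(model_pin(&pinned, MODEL_ARCH_NATIVE, 39) == 0);
	assert(!model_arch_mismatch(pinned, MODEL_ARCH_NATIVE, 1));
	assert(model_arch_mismatch(pinned, MODEL_ARCH_COMPAT, 20));
	assert(model_arch_mismatch(pinned, MODEL_ARCH_NATIVE,
				   MODEL_MUX_MASK | 39));
}
```

Under these assumed values a multiplexed syscall number turns NATIVE (1)
into MULTIPLEX (3), which would explain the numbering of the
SECCOMP_ARCH_IS_* constants. Calling model_demo() from a small main()
walks through the pin-then-check sequence.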