Date: Thu, 1 Oct 2020 10:58:50 -0600
From: Tycho Andersen
To: Jann Horn
Cc: Christian Brauner, linux-man, Song Liu, Will Drewry, Kees Cook,
 Daniel Borkmann, Giuseppe Scrivano, Robert Sesek, Linux Containers,
 lkml, Alexei Starovoitov, "Michael Kerrisk (man-pages)", bpf,
 Andy Lutomirski, Christian Brauner
Subject: Re: For review: seccomp_user_notif(2) manual page
Message-ID: <20201001165850.GC1260245@cisco>
References: <45f07f17-18b6-d187-0914-6f341fe90857@gmail.com>
 <20201001125043.dj6taeieatpw3a4w@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 01, 2020 at 05:47:54PM +0200, Jann Horn via Containers wrote:
> On Thu, Oct 1, 2020 at 2:54 PM Christian Brauner wrote:
> > On Wed, Sep 30, 2020 at 05:53:46PM +0200, Jann Horn via Containers wrote:
> > > On Wed, Sep 30, 2020 at 1:07 PM Michael Kerrisk (man-pages) wrote:
> > > > NOTES
> > > >        The file descriptor returned when seccomp(2) is employed with the
> > > >        SECCOMP_FILTER_FLAG_NEW_LISTENER flag can be monitored using
> > > >        poll(2), epoll(7), and select(2).  When a notification is pending,
> > > >        these interfaces indicate that the file descriptor is readable.
> > >
> > > We should probably also point out somewhere that, as
> > > include/uapi/linux/seccomp.h says:
> > >
> > >  * Similar precautions should be applied when stacking SECCOMP_RET_USER_NOTIF
> > >  * or SECCOMP_RET_TRACE. For SECCOMP_RET_USER_NOTIF filters acting on the
> > >  * same syscall, the most recently added filter takes precedence. This means
> > >  * that the new SECCOMP_RET_USER_NOTIF filter can override any
> > >  * SECCOMP_IOCTL_NOTIF_SEND from earlier filters, essentially allowing all
> > >  * such filtered syscalls to be executed by sending the response
> > >  * SECCOMP_USER_NOTIF_FLAG_CONTINUE. Note that SECCOMP_RET_TRACE can equally
> > >  * be overriden by SECCOMP_USER_NOTIF_FLAG_CONTINUE.
> > >
> > > In other words, from a security perspective, you must assume that the
> > > target process can bypass any SECCOMP_RET_USER_NOTIF (or
> > > SECCOMP_RET_TRACE) filters unless it is completely prohibited from
> > > calling seccomp(). This should also be noted over in the main
> > > seccomp(2) manpage, especially the SECCOMP_RET_TRACE part.
> >
> > So I was actually wondering about this when I skimmed this and a while
> > ago but forgot about this again... Afaict, you can only ever load a
> > single filter with SECCOMP_FILTER_FLAG_NEW_LISTENER set.
> > If there already is a filter with the SECCOMP_FILTER_FLAG_NEW_LISTENER
> > property in the tasks filter hierarchy then the kernel will refuse to
> > load a new one?
> >
> > static struct file *init_listener(struct seccomp_filter *filter)
> > {
> > 	struct file *ret = ERR_PTR(-EBUSY);
> > 	struct seccomp_filter *cur;
> >
> > 	for (cur = current->seccomp.filter; cur; cur = cur->prev) {
> > 		if (cur->notif)
> > 			goto out;
> > 	}
> >
> > shouldn't that be sufficient to guarantee that USER_NOTIF filters can't
> > override each other for the same task simply because there can only ever
> > be a single one?
>
> Good point. Exceeeept that that check seems ineffective because this
> happens before we take the locks that guard against TSYNC, and also
> before we decide to which existing filter we want to chain the new
> filter. So if two threads race with TSYNC, I think they'll be able to
> chain two filters with listeners together.

Yep, seems the check needs to also be in seccomp_can_sync_threads() to
be totally effective.

> I don't know whether we want to eternalize this "only one listener
> across all the filters" restriction in the manpage though, or whether
> the man page should just say that the kernel currently doesn't support
> it but that security-wise you should assume that it might at some
> point.

This requirement originally came from Andy, arguing that the semantics
of this were/are confusing, which still makes sense to me. Perhaps we
should do something like the below?

Tycho


diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 3ee59ce0a323..7b107207c2b0 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -376,6 +376,18 @@ static int is_ancestor(struct seccomp_filter *parent,
 	return 0;
 }
 
+static bool has_listener_parent(struct seccomp_filter *child)
+{
+	struct seccomp_filter *cur;
+
+	for (cur = current->seccomp.filter; cur; cur = cur->prev) {
+		if (cur->notif)
+			return true;
+	}
+
+	return false;
+}
+
 /**
  * seccomp_can_sync_threads: checks if all threads can be synchronized
  *
@@ -385,7 +397,7 @@ static int is_ancestor(struct seccomp_filter *parent,
  * either not in the correct seccomp mode or did not have an ancestral
  * seccomp filter.
  */
-static inline pid_t seccomp_can_sync_threads(void)
+static inline pid_t seccomp_can_sync_threads(unsigned int flags)
 {
 	struct task_struct *thread, *caller;
 
@@ -407,6 +419,11 @@ static inline pid_t seccomp_can_sync_threads(void)
 				 caller->seccomp.filter)))
 			continue;
 
+		/* don't allow TSYNC to install multiple listeners */
+		if (flags & SECCOMP_FILTER_FLAG_NEW_LISTENER &&
+		    !has_listener_parent(thread->seccomp.filter))
+			continue;
+
 		/* Return the first thread that cannot be synchronized. */
 		failed = task_pid_vnr(thread);
 		/* If the pid cannot be resolved, then return -ESRCH */
@@ -637,7 +654,7 @@ static long seccomp_attach_filter(unsigned int flags,
 	if (flags & SECCOMP_FILTER_FLAG_TSYNC) {
 		int ret;
 
-		ret = seccomp_can_sync_threads();
+		ret = seccomp_can_sync_threads(flags);
 		if (ret) {
 			if (flags & SECCOMP_FILTER_FLAG_TSYNC_ESRCH)
 				return -ESRCH;
@@ -1462,12 +1479,9 @@ static const struct file_operations seccomp_notify_ops = {
 static struct file *init_listener(struct seccomp_filter *filter)
 {
 	struct file *ret = ERR_PTR(-EBUSY);
-	struct seccomp_filter *cur;
 
-	for (cur = current->seccomp.filter; cur; cur = cur->prev) {
-		if (cur->notif)
-			goto out;
-	}
+	if (has_listener_parent(current->seccomp.filter))
+		goto out;
 
 	ret = ERR_PTR(-ENOMEM);
 	filter->notif = kzalloc(sizeof(*(filter->notif)), GFP_KERNEL);
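
For readers who want to poke at the "only one listener per filter chain" rule outside the kernel, here is a minimal userspace sketch of the chain walk discussed in this thread. It is not kernel code: struct demo_filter, chain_has_listener() and demo_attach() are hypothetical names made up for illustration, has_notif stands in for a non-NULL ->notif pointer, and -1 stands in for -EBUSY. It is single-threaded, so it does not model the TSYNC race at all; it only shows the check and the attach happening as one step, which is the property the kernel needs to get under its locks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel's struct seccomp_filter.
 * Only the two fields that the listener check looks at are modelled.
 */
struct demo_filter {
	struct demo_filter *prev;	/* next-older filter in the chain, or NULL */
	bool has_notif;			/* stands in for a non-NULL ->notif */
};

/*
 * Walk from the newest filter back to the oldest and report whether any
 * filter in the chain already carries a listener.
 */
static bool chain_has_listener(const struct demo_filter *newest)
{
	const struct demo_filter *cur;

	for (cur = newest; cur; cur = cur->prev) {
		if (cur->has_notif)
			return true;
	}

	return false;
}

/*
 * Attach new_filter on top of *head unless that would put a second
 * listener into the chain. Returns 0 on success and -1 (standing in for
 * -EBUSY) when the attach is refused; *head is left untouched on refusal.
 */
static int demo_attach(struct demo_filter **head, struct demo_filter *new_filter)
{
	assert(head != NULL && new_filter != NULL);

	if (new_filter->has_notif && chain_has_listener(*head))
		return -1;

	new_filter->prev = *head;
	*head = new_filter;
	return 0;
}
```

Attaching a plain filter, then a listener filter, then a second listener filter to the same head should succeed twice and be refused the third time, while a later non-listener filter is still accepted on top of the existing listener.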