Received: by 2002:ac0:a5a6:0:0:0:0:0 with SMTP id m35-v6csp48839imm; Thu, 27 Sep 2018 15:45:42 -0700 (PDT) X-Google-Smtp-Source: ACcGV61RlVjnrfNtYqFDM0ydPP3P88b6eO+2JZkVq2zlqYPZzfGKWkbuCZ+3nBniRy7gKbgCM9yk X-Received: by 2002:a17:902:274a:: with SMTP id j10-v6mr13382797plg.152.1538088342616; Thu, 27 Sep 2018 15:45:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1538088342; cv=none; d=google.com; s=arc-20160816; b=z7nrkrTpMzIIM6c2GLlWdZck5BdiwC5DHdBaKCyM98xSIY2YDFPzsR5kESz1W8sZ0o P2yRIWgYxwoKuf59DfEHqsPBtFsf20zmN+1pYjM0ATDvMy5Q74Nsua50JJDH7d7nItcI Aqch6twqq6w9AEshbM60al4lvd+pkxiylqltctU2wElM3G+ckn+Iya7xFevyjp/NT8gl 5yL1SAeZfYcKguSOu3vU48dDWdM1lGYCu2c/78oW68chyA5k/mJM4br3F6fo/yayTrym 4qC2RkfHfOzbxsVEwB1f3t9ODhllOlcE46Dzbw2IQTcTwH/Q+GhOy+uzZmlAZB/zoEvo nTfQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature; bh=j5x/kbi6IdF3I7YOIe2yLXkP6GMZMrWA6raHG+m2Svk=; b=u0dURZsIf/HcrLofk5Ro41f5qlq2eqpu8uz8AJwgF8qrI4jNwKUBun1zeppzQzdm2O Hrg3I85ie6Riedz71GXoPu4lJyKCH1isEDwRm8ElR4eRppfWJBUYs6Ode0S7xSMJYlZx tY5P5FI9khBk7O6WVycjS4qvyOuqVtC6Wp+nQKtvjBnwutlA/hemdoMrgAEK3UOTK6+y lyhitgF5Lq/GmMerbhdbcYI4NKmatugvViV8NQN7zL/Bo9oqq9WrmLESPhji8tj7b9rL ozcsovbQ3K8kATRh1fyLoZoSakjfILCrdnNrKl3UokQjLH1nEdyqQMYNBs/H9OmbUvuK TZOg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=NgIK4+NG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id f2-v6si3066336pgi.5.2018.09.27.15.45.26; Thu, 27 Sep 2018 15:45:42 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=NgIK4+NG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727599AbeI1FFw (ORCPT + 99 others); Fri, 28 Sep 2018 01:05:52 -0400 Received: from mail-yb1-f194.google.com ([209.85.219.194]:35464 "EHLO mail-yb1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726043AbeI1FFw (ORCPT ); Fri, 28 Sep 2018 01:05:52 -0400 Received: by mail-yb1-f194.google.com with SMTP id o63-v6so1852134yba.2 for ; Thu, 27 Sep 2018 15:45:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=j5x/kbi6IdF3I7YOIe2yLXkP6GMZMrWA6raHG+m2Svk=; b=NgIK4+NGNz1sgQRRsVtMywCkAVsqYySCJNIp2BMdZDPwntOLHsmM6V8gsnRm281mFr eXzssmy/rmmc5w8zAUzP5aw7rFuVwygkp0Uw5HWEdk0cHz+HrNuuAFOeXBDloH9y66yr fKplF8tGbRcMtuRU9dqboRF6c8HwVqOHMtu1I= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=j5x/kbi6IdF3I7YOIe2yLXkP6GMZMrWA6raHG+m2Svk=; b=J/ouPDv14IiczmTPTITnZ0ZDLlWsJadZQavvec+ZnJewv2Y0EnBaXLyiDDdOoQiVBT fOPGLG4jCU1l1YodlAlwjCnswneqHnbaSEOtS+Gr58QL0u/ZGLCnycPTq0zKm7RClCsT vIvWDNUP3Vjkvbgn/e7caEw5TOUhdyrY8T4gLeC+VjiZAWDB7eIJxXR7pC1Gz7CRMt/6 qLPbGzwNEgv0bDoYi9OZymUN6PPmd6vq6DYC5APeZjIzY/4/gABskReCmrvp6IvDm6ft TEkTBu8UBZ02RWcFbxHQNQ8sTtS4mYGcfny2Kv5+iqXV/ryyKLjZSTkd0TFV7gusq3Ke jbRw== X-Gm-Message-State: ABuFfojIroj9seACnDNH4o7i7nIFhvSU/46CJygkS6QhOCPCngzaCzLR r69uPeOTt1RSB/+t1JR2G8zTGs6/35I= X-Received: by 2002:a25:c64d:: with SMTP id k74-v6mr6937260ybf.62.1538088315173; Thu, 27 Sep 2018 15:45:15 -0700 (PDT) Received: from mail-yb1-f170.google.com (mail-yb1-f170.google.com. [209.85.219.170]) by smtp.gmail.com with ESMTPSA id 124-v6sm1298881yws.25.2018.09.27.15.45.13 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 27 Sep 2018 15:45:13 -0700 (PDT) Received: by mail-yb1-f170.google.com with SMTP id o8-v6so1818701ybk.13 for ; Thu, 27 Sep 2018 15:45:13 -0700 (PDT) X-Received: by 2002:a25:19c3:: with SMTP id 186-v6mr7114777ybz.410.1538088312667; Thu, 27 Sep 2018 15:45:12 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a25:d116:0:0:0:0:0 with HTTP; Thu, 27 Sep 2018 15:45:11 -0700 (PDT) In-Reply-To: References: <20180927151119.9989-1-tycho@tycho.ws> <20180927151119.9989-2-tycho@tycho.ws> From: Kees Cook Date: Thu, 27 Sep 2018 15:45:11 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [PATCH v7 1/6] seccomp: add a return code to trap to userspace To: Jann Horn Cc: Tycho Andersen , Christoph Hellwig , Al Viro , "linux-fsdevel@vger.kernel.org" , kernel list , Linux Containers , Linux API , Andy Lutomirski , Oleg Nesterov , "Eric W. Biederman" , "Serge E. Hallyn" , Christian Brauner , Tyler Hicks , Akihiro Suda Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Sep 27, 2018 at 2:51 PM, Jann Horn wrote: > On Thu, Sep 27, 2018 at 5:11 PM Tycho Andersen wrote: >> However, care should be taken to avoid the TOCTOU >> +mentioned above in this document: all arguments being read from the tracee's >> +memory should be read into the tracer's memory before any policy decisions are >> +made. This allows for an atomic decision on syscall arguments. > > Again, I don't really see how you could get this wrong. Doesn't hurt to mention it, IMO. >> +static long seccomp_notify_send(struct seccomp_filter *filter, >> + unsigned long arg) >> +{ >> + struct seccomp_notif_resp resp = {}; >> + struct seccomp_knotif *knotif = NULL; >> + long ret; >> + u16 size; >> + void __user *buf = (void __user *)arg; >> + >> + if (copy_from_user(&size, buf, sizeof(size))) >> + return -EFAULT; >> + size = min_t(size_t, size, sizeof(resp)); >> + if (copy_from_user(&resp, buf, size)) >> + return -EFAULT; >> + >> + ret = mutex_lock_interruptible(&filter->notify_lock); >> + if (ret < 0) >> + return ret; >> + >> + list_for_each_entry(knotif, &filter->notif->notifications, list) { >> + if (knotif->id == resp.id) >> + break; >> + } >> + >> + if (!knotif || knotif->id != resp.id) { > > Uuuh, this looks unsafe and wrong. I don't think `knotif` can ever be > NULL here. If `filter->notif->notifications` is empty, I think > `knotif` will be `container_of(&filter->notif->notifications, struct > seccom_knotif, list)` - in other words, you'll have a type confusion, > and `knotif` probably points into some random memory in front of > `filter->notif`. > > Am I missing something? Oh, good catch. This just needs to be fixed like it's done in seccomp_notif_recv (separate cur and knotif). >> +static struct file *init_listener(struct task_struct *task, >> + struct seccomp_filter *filter) >> +{ > > Why does this function take a `task` pointer instead of always > accessing `current`? If `task` actually wasn't `current`, I would have > concurrency concerns. A comment in seccomp.h even explains: > > * @filter must only be accessed from the context of current as there > * is no read locking. > > Unless there's a good reason for it, I would prefer it if this > function didn't take a `task` pointer. This is to support PTRACE_SECCOMP_NEW_LISTENER. But you make an excellent point. Even TSYNC expects to operate only on the current thread group. Hmm. While the process is stopped by ptrace, we could, in theory, update task->seccomp.filter via something like TSYNC. So perhaps use: mutex_lock_killable(&task->signal->cred_guard_mutex); before walking the notify_locks? > >> + struct file *ret = ERR_PTR(-EBUSY); >> + struct seccomp_filter *cur, *last_locked = NULL; >> + int filter_nesting = 0; >> + >> + for (cur = task->seccomp.filter; cur; cur = cur->prev) { >> + mutex_lock_nested(&cur->notify_lock, filter_nesting); >> + filter_nesting++; >> + last_locked = cur; >> + if (cur->notif) >> + goto out; >> + } >> + >> + ret = ERR_PTR(-ENOMEM); >> + filter->notif = kzalloc(sizeof(*(filter->notif)), GFP_KERNEL); > > sizeof(struct notification) instead, to make the code clearer? I prefer what Tycho has: I want to allocate an instances of whatever filter->notif is. Though, let's do the kzalloc outside of the locking, instead? >> + ret = anon_inode_getfile("seccomp notify", &seccomp_notify_ops, >> + filter, O_RDWR); >> + if (IS_ERR(ret)) >> + goto out; >> + >> + >> + /* The file has a reference to it now */ >> + __get_seccomp_filter(filter); > > __get_seccomp_filter() has a comment in it that claims "/* Reference > count is bounded by the number of total processes. */". I think this > change invalidates that comment. I think it should be fine to just > remove the comment. Update it to "bounded by total processes and notification listeners"? >> +out: >> + for (cur = task->seccomp.filter; cur; cur = cur->prev) { > > s/; cur;/; 1;/, or use a while loop instead? If the NULL check fires > here, something went very wrong. Hm? This is correct. This is how seccomp_run_filters() walks the list too: struct seccomp_filter *f = READ_ONCE(current->seccomp.filter); ... for (; f; f = f->prev) { Especially if we'll be holding the cred_guard_mutex. -Kees -- Kees Cook Pixel Security