From: Andy Lutomirski
Date: Wed, 23 Oct 2019 14:25:35 -0700
Subject: Re: [PATCH 3/7] Add a UFFD_SECURE flag to the userfaultfd API.
To: Andrea Arcangeli
Cc: Andy Lutomirski, Jann Horn, Daniel Colascione, Linus Torvalds,
    Pavel Emelyanov, Lokesh Gidra, Nick Kralevich, Nosh Minwalla,
    Tim Murray, Mike Rapoport, Linux API, LKML,
    "Dr. David Alan Gilbert"
In-Reply-To: <20191023211645.GC9902@redhat.com>
References: <20191012191602.45649-1-dancol@google.com>
    <20191012191602.45649-4-dancol@google.com>
    <20191023190959.GA9902@redhat.com>
    <20191023211645.GC9902@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 23, 2019 at 2:16 PM Andrea Arcangeli wrote:
>
> On Wed, Oct 23, 2019 at 12:21:18PM -0700, Andy Lutomirski wrote:
> > There are two things going on here.
> >
> > 1. Daniel wants to add LSM labels to userfaultfd objects.  This seems
> > reasonable to me.  The question, as I understand it, is: who is the
> > subject that creates a uffd referring to a forked child?  I'm sure
> > this is solvable in any number of straightforward ways, but I think
> > it's less important than:
>
> The new uffd created during fork would definitely need to be accounted
> to the criu monitor, not to the parent or the child, so it'd need to
> be accounted to the process/context that has the fd in its file
> descriptors array. But since this is less important let's ignore this
> for a second.
>
> > 2. The existing ABI is busted independently of #1.  Suppose you call
> > userfaultfd to get a userfaultfd and enable UFFD_FEATURE_EVENT_FORK.
> > Then you do:
> >
> > $ sudo <&[userfaultfd number]
> >
> > Sudo will read it and get a new fd unexpectedly added to its fd table.
> > It's worse if SCM_RIGHTS is involved.
>
> So the problem is just that a new fd is created.
> So for this to turn out to be a practical issue, it requires finding a
> reckless suid that won't even bother checking the return value of the
> open/socket syscalls or some equivalent fd number related side effect.
> All right, that makes more sense now and of course I agree it needs
> fixing.

Or it requires a long-lived daemon that receives fds over SCM_RIGHTS
and reads from them.

>
> > So I think we either need to declare that UFFD_FEATURE_EVENT_FORK is
> > only usable by global root or we need to remove it and maybe re-add it
> > in some other form.
>
> If I had a time machine, I'd rather do the below:
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index fe6d804a38dc..574062051678 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1958,7 +1958,7 @@ SYSCALL_DEFINE1(userfaultfd, int, flags)
>                 return -ENOMEM;
>
>         refcount_set(&ctx->refcount, 1);
> -       ctx->flags = flags;
> +       ctx->flags = flags | UFFD_CLOEXEC;

That doesn't solve the problem.  With your time machine, you should
instead use ioctl() or recvmsg().

>
> 4) enforce the global root permission check when creating the uffd only if
> UFFD_FEATURE_EVENT_FORK is set.

This could work, but we should also add a better way to do
UFFD_FEATURE_EVENT_FORK and get CRIU to start using it.  If CRIU is
the only user, we can probably drop the old ABI after a couple of
releases, since as far as I know, CRIU users need to upgrade their
CRIU more or less in sync with the kernel so that new kernel features
get checkpointed and restored.
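
For readers who have not seen the SCM_RIGHTS pattern referred to above,
here is a minimal userspace sketch of a process that receives an fd
over an AF_UNIX socket and then reads from it.  The function names
(send_fd, recv_fd, daemon_read_once, demo_roundtrip) are hypothetical
and exist only for illustration; none of this comes from the patch
series, the kernel, or CRIU.  The fd passed here is an ordinary pipe,
so the read() is harmless; the concern in the thread is what happens
when the sender passes a userfaultfd with UFFD_FEATURE_EVENT_FORK
enabled instead.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one descriptor over a connected AF_UNIX socket.  Returns 0 or -1. */
int send_fd(int sock, int fd)
{
	char byte = 'F';	/* one byte of ordinary data carries the cmsg */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor.  Returns the new fd in this process, or -1. */
int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd = -1;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len < CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/*
 * Hypothetical daemon step: accept an fd from a peer and read from it.
 * The recvmsg() above is an explicit request for a new descriptor; the
 * read() below is normally expected to have no effect on the fd table.
 */
ssize_t daemon_read_once(int sock, char *buf, size_t len)
{
	int fd = recv_fd(sock);
	ssize_t n;

	if (fd < 0)
		return -1;
	n = read(fd, buf, len);
	close(fd);
	return n;
}

/* Single-process demo: pass the read end of a pipe and read it back. */
int demo_roundtrip(void)
{
	int sv[2], p[2];
	char buf[8] = { 0 };
	ssize_t n = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;
	if (pipe(p) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (write(p[1], "hello", 5) == 5 && send_fd(sv[0], p[0]) == 0)
		n = daemon_read_once(sv[1], buf, sizeof(buf));

	close(p[0]);
	close(p[1]);
	close(sv[0]);
	close(sv[1]);
	return (n == 5 && memcmp(buf, "hello", 5) == 0) ? 0 : -1;
}
```

Calling demo_roundtrip() from a main() should return 0 on a Linux
system.  The split between recv_fd() and the later read() is the point
of the sketch: a receiver can see and vet descriptors that arrive via
recvmsg(), whereas a read() that installs a descriptor as a side
effect gives it no such opportunity.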