Subject: Re: [PATCH RFC 2/3] io_uring: add IOURING_REGISTER_RESTRICTIONS opcode
To: Stefano Garzarella
Cc: Sargun Dhillon, Kees Cook, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Kernel Hardening, Jann Horn, Aleksa Sarai, Christian Brauner, Stefan Hajnoczi, io-uring@vger.kernel.org, Alexander Viro, Jeff Moyer
References: <20200710141945.129329-1-sgarzare@redhat.com> <20200710141945.129329-3-sgarzare@redhat.com>
From: Jens Axboe
Date: Fri, 10 Jul 2020 11:52:48 -0600
In-Reply-To: <20200710141945.129329-3-sgarzare@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/10/20 8:19 AM, Stefano Garzarella wrote:
> The new io_uring_register(2) IOURING_REGISTER_RESTRICTIONS opcode
> permanently installs a feature whitelist on an io_ring_ctx.
> The io_ring_ctx can then be passed to untrusted code with the
> knowledge that only operations present in the whitelist can be
> executed.
>
> The whitelist approach ensures that new features added to io_uring
> do not accidentally become available when an existing application
> is launched on a newer kernel version.

Keeping with the trend of the times, you should probably use 'allowlist'
here instead of 'whitelist'.

>
> Currently is it possible to restrict sqe opcodes and register
> opcodes. It is also possible to allow only fixed files.
>
> IOURING_REGISTER_RESTRICTIONS can only be made once. Afterwards
> it is not possible to change restrictions anymore.
> This prevents untrusted code from removing restrictions.

A few comments below.

> @@ -337,6 +344,7 @@ struct io_ring_ctx {
>  	struct llist_head		file_put_llist;
>
>  	struct work_struct		exit_work;
> +	struct io_restriction		restrictions;
>  };
>
>  /*

Since very few will use this feature, was going to suggest that we make
it dynamically allocated. But it's just 32 bytes, currently, so probably
not worth the effort...

> @@ -5491,6 +5499,11 @@ static int io_req_set_file(struct io_submit_state *state, struct io_kiocb *req,
>  	if (unlikely(!fixed && io_async_submit(req->ctx)))
>  		return -EBADF;
>
> +	if (unlikely(!fixed && req->ctx->restrictions.enabled &&
> +		     test_bit(IORING_RESTRICTION_FIXED_FILES_ONLY,
> +			      req->ctx->restrictions.restriction_op)))
> +		return -EACCES;
> +
>  	return io_file_get(state, req, fd, &req->file, fixed);
>  }

This one hurts, though. I don't want any extra overhead from the
feature, and you're digging deep in ctx here to figure out if we need to
check. Generally, all the checking needs to be out-of-line, and it needs
to base the decision on whether to check something or not on a cache hot
piece of data. So I'd suggest to turn all of these into some flag.
ctx->flags generally mirrors setup flags, so probably just add a:

	unsigned int restrictions : 1;

after eventfd_async : 1 in io_ring_ctx. That's free, plenty of room there
and that cacheline is already pulled in for reading.

-- 
Jens Axboe
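
For illustration, here is a minimal userspace sketch of the pattern suggested above: a one-bit flag that sits next to the other hot flags gates an out-of-line check, so the unrestricted path pays for a single bit test. This is not kernel code and not the patch as merged. Every demo_* name, the 64-opcode limit, and the layout of struct demo_ctx are assumptions made up for the example; only the "restrictions : 1 after eventfd_async : 1" idea and the -EACCES return come from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical limit, chosen so the allowlist fits in one 64-bit mask. */
#define DEMO_OP_MAX 64
static_assert(DEMO_OP_MAX <= 64, "allowlist mask must fit in 64 bits");

#if defined(__GNUC__)
#define demo_unlikely(x) __builtin_expect(!!(x), 0)
#define demo_noinline __attribute__((noinline))
#else
#define demo_unlikely(x) (x)
#define demo_noinline
#endif

/* Cold data: only read once the hot flag says restrictions are active. */
struct demo_restriction {
	unsigned long long sqe_op;	/* bit N set means opcode N is allowed */
	bool fixed_files_only;		/* reject non-fixed files when true */
};

/* Stand-in for io_ring_ctx; the real structure is far larger. */
struct demo_ctx {
	/* Hot flags, assumed to share a cacheline that is read anyway. */
	unsigned int eventfd_async : 1;
	unsigned int restricted : 1;	/* plays the role of "restrictions : 1" */

	struct demo_restriction restrictions;
};

/* Slow path, kept out of line so it adds nothing to the common case. */
static demo_noinline int demo_check_restrictions(const struct demo_ctx *ctx,
						 unsigned int opcode, bool fixed)
{
	if (opcode >= DEMO_OP_MAX ||
	    !(ctx->restrictions.sqe_op & (1ULL << opcode)))
		return -EACCES;
	if (ctx->restrictions.fixed_files_only && !fixed)
		return -EACCES;
	return 0;
}

/* Fast path: one test of a hot bit decides whether to check at all. */
static inline int demo_submit(const struct demo_ctx *ctx,
			      unsigned int opcode, bool fixed)
{
	if (demo_unlikely(ctx->restricted)) {
		int ret = demo_check_restrictions(ctx, opcode, fixed);

		if (ret)
			return ret;
	}
	return 0;	/* the actual submission work would go here */
}
```

With a zero-initialized struct demo_ctx, demo_submit() accepts any opcode without touching the restrictions member. Once restricted is set, only opcodes whose bit is present in sqe_op pass, and non-fixed files are refused when fixed_files_only is true.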