Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp1009254ybl; Fri, 24 Jan 2020 13:42:53 -0800 (PST) X-Google-Smtp-Source: APXvYqzpPY3ejAlS5cSNrhuTH5B2w+5BJZChNS3e1WkD3J29/5Z2emSzm3TNi/FficZCSdfNncVx X-Received: by 2002:a05:6808:618:: with SMTP id y24mr608348oih.86.1579902173106; Fri, 24 Jan 2020 13:42:53 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1579902173; cv=none; d=google.com; s=arc-20160816; b=dGZPM+leZXKSgxu5C41SUETsl/LeoW52ZJnxb4soGez8w6FtnZDmXrdtbacap+/xo3 4wgnLPTCDJWWrodTw1E9KvTqZryRP0F/zeHjsN6e0RV++HQ2fGBP2djg7U0KHQuFjTW+ eoQ+k3QRcybfG5HNToTVIqBmN0Q+ZP6gxtwE4VplaL7cLszCgNzIvFvH2Rrwl4rh4ozP /wkPCtAgUxBSBa6VQUx4Y1l2s2jCxz0BTbW8kMI00SHePToxpuCuzAT7cYuCbXJ2e32i h4oMlj2HAZ+4aNIEAcZy6q4H9JzWnnLnXUTa8EIpXnNZ4IllymSdeo7a3s50hkyGihp6 7Rdw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature; bh=HDVPrh9iJxJQ8bOLJxHZkUXlGRmda9MUCxj388b9fsw=; b=ttse4Wd+UirAra+0qfe/SduBdS648zYx3SrUC13a5eha6fFDXgmV1+ralLOKVonxyV qPaU6q01GwrZcRyZjoc/uPth3QiZ43o2GmvhIt/5tH04kOYELU/5OJZlgeTvcyfySQ5u 3pnqhQdgLL35YD/cb75aor0q2I+1vOiwUzjl9aZDEV/kKImzCChpq5OIsXLE5Nc6HRyJ YcHyCU/Fb9xzit8RAvequwxQxNnloArRLLflmWLsNW7Qg2piTiacZ12tzmKsqSDk5CSd YK83eLx9IYKcTSyXPaKJl2FJKh7aojazBGXObPBot5Heek62SH0W7G/cZ2NMvw51C1tb u11A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel-dk.20150623.gappssmtp.com header.s=20150623 header.b=S02nL1hU; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id m7si392717oih.7.2020.01.24.13.42.41; Fri, 24 Jan 2020 13:42:53 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel-dk.20150623.gappssmtp.com header.s=20150623 header.b=S02nL1hU; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729047AbgAXVlq (ORCPT + 99 others); Fri, 24 Jan 2020 16:41:46 -0500 Received: from mail-il1-f196.google.com ([209.85.166.196]:38701 "EHLO mail-il1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728981AbgAXVlo (ORCPT ); Fri, 24 Jan 2020 16:41:44 -0500 Received: by mail-il1-f196.google.com with SMTP id f5so2772589ilq.5 for ; Fri, 24 Jan 2020 13:41:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-language:content-transfer-encoding; bh=HDVPrh9iJxJQ8bOLJxHZkUXlGRmda9MUCxj388b9fsw=; b=S02nL1hU4YEt50qPyBYSgo7yAha6h5JEeQEE6fRcR7aThKJYhBW3kXMmgf3Sv/kU9+ r9pcuxJMUfDluyAQRy6PYCYkXqM882+xzwIeWtjMy4sXbG8rAAuw/67NPoHkQklEfcj3 y+yJzkJaO2TcZUw+bvJTxT0b/UR2IKWUVSBrz4NC28h8ONdnfmo6jGobH/CiS0q5f66Q Df2S/Q3WrR37qHsjjPeme8ZwM5/0VtoBNWD5W6h0TzPXPqu1fAIONi3r6ccqP62CegCI PAUwU2bmKY3RnR4pcnEkxUgSfFsA8V8lWiR8usnxoSyxnAphcVuoA6BiyoadfKqm1747 sfcA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=HDVPrh9iJxJQ8bOLJxHZkUXlGRmda9MUCxj388b9fsw=; b=LkqUqC7pzcWbzzDRVyV/Y/B8lXVxYFXjKNObUNQrxyHkifvFm3S8+8WoEg0wJ0XoPJ YiWcTiDuZGkPQgHHka1R/px3RstfpG33kIjNgx7Anw7/Sg49W3joA08kUzQ9Z0OZjEHr wR21TDvDZHEhzTY22kr/PiVojVHUhyAj1tq1rIoUHjzavB0nId+gVlT4Mds2n6JH9RT5 iT4y8dLjQK/N7inRjeFc1Tbviuk60GMe87W5q4lhImGJ92tIaTEqlCzcQRZWRMqOm8x/ uiK4xaxzH8MXuw/rmpJ9Y42QFutr7CxBvNX2IgR4fmMS88NiQ0a+k/6/MHich7fmG3fb OiWA== X-Gm-Message-State: APjAAAV4q8FgT3GI7IPjvsXbzT6902JyQ2zQb6bECkmRMJ3QIJx/uSw7 ovm68E/ALO7pAh0/k0IowUznHefcmrs= X-Received: by 2002:a92:d642:: with SMTP id x2mr5004490ilp.169.1579902103475; Fri, 24 Jan 2020 13:41:43 -0800 (PST) Received: from [192.168.1.159] ([65.144.74.34]) by smtp.gmail.com with ESMTPSA id v7sm1444336iom.58.2020.01.24.13.41.42 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 24 Jan 2020 13:41:43 -0800 (PST) Subject: Re: [PATCH 5.4 033/222] io_uring: only allow submit from owning task To: Stefan Metzmacher , Greg Kroah-Hartman , linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org, io-uring References: <20200122092833.339495161@linuxfoundation.org> <20200122092835.852416399@linuxfoundation.org> <1b4a79c1-6cda-12a8-219b-0c1c146faeff@samba.org> From: Jens Axboe Message-ID: <70b6d983-0883-5462-45a4-2305cb92cf88@kernel.dk> Date: Fri, 24 Jan 2020 14:41:42 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.4.1 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 1/24/20 12:11 PM, Stefan Metzmacher wrote: > Am 24.01.20 um 17:58 schrieb Jens Axboe: >> On 1/24/20 3:38 AM, Stefan Metzmacher wrote: >>> Am 22.01.20 um 10:26 schrieb Greg Kroah-Hartman: >>>> From: Jens Axboe >>>> >>>> commit 44d282796f81eb1debc1d7cb53245b4cb3214cb5 upstream. >>>> >>>> If the credentials or the mm doesn't match, don't allow the task to >>>> submit anything on behalf of this ring. The task that owns the ring can >>>> pass the file descriptor to another task, but we don't want to allow >>>> that task to submit an SQE that then assumes the ring mm and creds if >>>> it needs to go async. >>>> >>>> Cc: stable@vger.kernel.org >>>> Suggested-by: Stefan Metzmacher >>>> Signed-off-by: Jens Axboe >>>> Signed-off-by: Greg Kroah-Hartman >>>> >>>> >>>> --- >>>> fs/io_uring.c | 6 ++++++ >>>> 1 file changed, 6 insertions(+) >>>> >>>> --- a/fs/io_uring.c >>>> +++ b/fs/io_uring.c >>>> @@ -3716,6 +3716,12 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned >>>> wake_up(&ctx->sqo_wait); >>>> submitted = to_submit; >>>> } else if (to_submit) { >>>> + if (current->mm != ctx->sqo_mm || >>>> + current_cred() != ctx->creds) { >>>> + ret = -EPERM; >>>> + goto out; >>>> + } >>>> + >>> >>> I thought about this a bit more. >>> >>> I'm not sure if this is actually to restrictive, >>> because it means applications like Samba won't >>> be able to use io-uring at all. >>> >>> As even if current_cred() and ctx->creds describe the same >>> set of uid,gids the != won't ever match again and >>> makes the whole ring unuseable. >>> >>> I'm not sure about what the best short term solution could be... >>> >>> 1. May just doing the check for path based operations? >>> and fail individual requests with EPERM. >>> >>> 2. Or force REQ_F_FORCE_ASYNC for path based operations, >>> so that they're always executed from within the workqueue >>> with were ctx->creds is active. >>> >>> 3. Or (as proposed earlier) do the override_creds/revert_creds dance >>> (and similar for mm) if needed. >>> >>> To summaries the problem again: >>> >>> For path based operations like: >>> - IORING_OP_CONNECT (maybe also - IORING_OP_ACCEPT???) >>> - IORING_OP_SEND*, IORING_OP_RECV* on DGRAM sockets >>> - IORING_OP_OPENAT, IORING_OP_STATX, IORING_OP_OPENAT2 >>> it's important under which current_cred they are called. >>> >>> Are IORING_OP_MADVISE, IORING_OP_FADVISE and IORING_OP_FALLOCATE >>> are only bound to the credentials of the passed fd they operate on? >>> >>> The current assumption is that the io_uring_setup() syscall captures >>> the current_cred() to ctx->cred and all operations on the ring >>> are executed under the context of ctx->cred. >>> Therefore all helper threads do the override_creds/revert_creds dance. >> >> But it doesn't - we're expecting them to match, and with this change, >> we assert that it's the case or return -EPERM. >> >>> But the possible non-blocking line execution of operations in >>> the io_uring_enter() syscall doesn't do the override_creds/revert_creds >>> dance and execute the operations under current_cred(). >>> >>> This means it's random depending on filled cached under what >>> credentials an operation is executed. >>> >>> In order to prevent security problems the current patch is enough, >>> but as outlined above it will make io-uring complete unuseable >>> for applications using any syscall that changes current_cred(). >>> >>> Change 1. would be a little bit better, but still not really useful. >>> >>> I'd actually prefer solution 3. as it's still possible to make >>> use of non-blocking operations, while the security is the >>> same as solution 2. >> >> For your situation, we need to extend it anyway, and provide a way >> to swap between personalities. So yeah it won't work as-is for your >> use case, but we can work on making that the case. > > That's only for the OPENAT2 case, which we might want to use in future, > but there's a lot of work required to have async opens in Samba. > > But I have a experimental module that, just use READV, WRITEV and FSYNC > with io-uring in order to avoid our userspace helper threads. > > And that won't work anymore with the change as Samba change > current_cred() very often switch between (at least) 2 identities > root and the user. That will change the pointer of current_cred() each time. > > I mean I could work around the check by using IORING_SETUP_SQPOLL, > but I'd like to avoid that. It's easy enough to support the current creds from io_uring_enter(), where we need a bit of plumbing is if we have to go async for that particular operation. We currently have that static as well, which is why the current patch is needed. What I'm trying to say is that'll we'll need code changes to support this in any case, even just reverting that change isn't going to make the problem go away for you. Hence we just need to decide on what the best way to do this would be! -- Jens Axboe