Received: by 2002:a05:7412:419a:b0:f3:1519:9f41 with SMTP id i26csp3979126rdh; Tue, 28 Nov 2023 08:37:28 -0800 (PST) X-Google-Smtp-Source: AGHT+IFqZEqYBsqkgBnOL9psCHy4F1/KzSd21FenoBP2ytLHNxz0CQmHOlOcQGa6TmHAjYXoxPRC X-Received: by 2002:a05:6a20:3944:b0:18b:b858:17a5 with SMTP id r4-20020a056a20394400b0018bb85817a5mr20701783pzg.28.1701189448599; Tue, 28 Nov 2023 08:37:28 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701189448; cv=none; d=google.com; s=arc-20160816; b=Pk/PZkxKEdMG2tvUZulH6aaTJam/tnVoKS2o8VE4dNQ/NBma+PPkUuef+SCnZ7xUK1 dLt3+wB//FXNc3G2hejizaQxeCP5TRbK7xLBd+qmJVJzDkTGpEzRE2PwSVA4mvIOeYml dW9aTRVnpz0iXMH8qatG4WOKnlurgh0ThaUtf5e7OsBQwygL29eMO7Aui4R27GlvfEGE UO2BilPvcVEPopsq6k2P5JCzRJ5py1aykEqRd/W611D4KTbJiz9BnUFL6udo2SdPHMJf 4xlZ7X4xrowue0698s1OMX7YUcrKICF8B28XtZmzajXGapAwGnnAiLmscOi+EDQ/X0yy snmA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=dcliY1XQfKK+/dtT+85ru7HI6htSCnB/zKwxrnWgARo=; fh=/m51Y97TAduMh8+mKb845oLAfUuUr+WcYMKloowUcP8=; b=uWYK4N63vNpzzydM9Cn8k9iua6PFpReAnT7qvhrkohvllBV2FEzBNQYVL3sKSwMMa7 lvOILlRLk+XBaNbP+Z2PbrMXZg2uDaiMW5l4nlDQfaWfaWro6lrj/Tt5TyIgba5tyyxf KiOCuQs0JItilXap4aDM7/WIcSRKiFFzdG/T6FBEH/da/Ou8QjsEcjCZhQuRsrz4LR69 5gVE+kLXbQhp0tTr8apzdnmdbXE/3VKcd31jER68FBCeTPOiO2EdaFyIFO7QhyPCVbMi KVW8etUFB6AS+XL1T2HDIVHC+qVAWGEuGb5vRgR3AyfCaWmVnuNqvSO2GPxIIVSi5Rlq 1Llw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=Ic1RryLi; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from morse.vger.email (morse.vger.email. [23.128.96.31]) by mx.google.com with ESMTPS id i30-20020a63585e000000b005bdfd3a0d3esi12514797pgm.185.2023.11.28.08.37.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 28 Nov 2023 08:37:28 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) client-ip=23.128.96.31; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=Ic1RryLi; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by morse.vger.email (Postfix) with ESMTP id 41618806D801; Tue, 28 Nov 2023 08:37:26 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at morse.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344432AbjK1QhI (ORCPT + 99 others); Tue, 28 Nov 2023 11:37:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49622 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229658AbjK1QhH (ORCPT ); Tue, 28 Nov 2023 11:37:07 -0500 Received: from mail-ed1-x52f.google.com (mail-ed1-x52f.google.com [IPv6:2a00:1450:4864:20::52f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7588EAB for ; Tue, 28 Nov 2023 08:37:13 -0800 (PST) Received: by mail-ed1-x52f.google.com with SMTP id 4fb4d7f45d1cf-54744e66d27so13437a12.0 for ; Tue, 28 Nov 2023 08:37:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701189432; x=1701794232; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=dcliY1XQfKK+/dtT+85ru7HI6htSCnB/zKwxrnWgARo=; b=Ic1RryLi+zzyOBwb40F0pEBKuPJbVkDgdRcpv1LkA5tG1TFVKSDyyS0kEc51eexNGq hQCTlVXNiSsdrSrEILgcdyH9b6GxdB6PcD4ZzoSRx/lGh9Wrv0Xray5XjBuMUVXk92al ipdmaP2Wo/WZK+yhhSwmm2vqLw1QhEPdl2mN29xOzv+wT5IN4UzjutOOO4aQRn4D2KTZ +z5KGemi8x/DyfiyZ044dEVjBTU+38avbJmy3SUGskuWeN2HTRlwtsPrj/OY/YQEeqeh M8sCNW+OqlYJzQ7s51Viu/vnSYNgEaf/PtBn7712TG0tK5oxNdE0RIzj5DXKjUCraPQs Y89A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701189432; x=1701794232; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=dcliY1XQfKK+/dtT+85ru7HI6htSCnB/zKwxrnWgARo=; b=PW3Nz78lsRsRLxV73KyCgsXznkZ0X6YucVjrAsta3S4kdHTiYz48T/L48j1GZp16oz tPRR74+UWXlW3bdAL60we5lEyUSwPg8GWtDtatc7b80ZG4T4kRohqOoSVrjjosyJBUmx g7kGMTdUnqI+9KgE4dF2JLc2v4FqlFXItyT5zPSfvBD+dyP4OzlUamvdHwd7DR0iVMxk G9LbvIiL7Tu02mSp/1aYtwgm7V6dkdQWA/1d7MHCi185Ry0/gC44pisAPy2H+0Ry1OH1 z71weEuMWouDtng02gWPmTipjaX6e5qNQLERZhijxN+0ei+xuFCdEOEAi2u0YY15Lah7 RmBA== X-Gm-Message-State: AOJu0YwGyyAeYpxNwjL0qHWylmPcPtXGGUHCpPKFtsm15GXqLsZHP3DG K/48zQ5SjGV/6qbxo+3+1743IA1BiYJlaCo0Hn4+Dg== X-Received: by 2002:a05:6402:b83:b0:544:e249:be8f with SMTP id cf3-20020a0564020b8300b00544e249be8fmr709918edb.1.1701189431791; Tue, 28 Nov 2023 08:37:11 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Jann Horn Date: Tue, 28 Nov 2023 17:36:34 +0100 Message-ID: Subject: Re: io_uring: risky use of task work, especially wrt fdget() To: Jens Axboe Cc: Pavel Begunkov , io-uring , kernel list Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-8.4 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE, USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on morse.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (morse.vger.email [0.0.0.0]); Tue, 28 Nov 2023 08:37:26 -0800 (PST) On Tue, Nov 28, 2023 at 5:19=E2=80=AFPM Jens Axboe wrote: > On 11/28/23 8:58 AM, Jens Axboe wrote: > > On 11/27/23 2:53 PM, Jann Horn wrote: > >> Hi! > >> > >> I noticed something that I think does not currently cause any > >> significant security issues, but could be problematic in the future: > >> > >> io_uring sometimes processes task work in the middle of syscalls, > >> including between fdget() and fdput(). My understanding of task work > >> is that it is expected to run in a context similar to directly at > >> syscall entry/exit: task context, no locks held, sleeping is okay, and > >> it doesn't execute in the middle of some syscall that expects private > >> state of the task_struct to stay the same. > >> > >> An example of another user of task work is the keyring subsystem, > >> which does task_work_add() in keyctl_session_to_parent() to change the > >> cred pointers of another task. > >> > >> Several places in io_uring process task work while holding an fdget() > >> reference to some file descriptor. For example, the io_uring_enter > >> syscall handler calls io_iopoll_check() while the io_ring_ctx is only > >> referenced via fdget(). This means that if there were another kernel > >> subsystem that uses task work to close file descriptors, io_uring > >> would become unsafe. And io_uring does _almost_ that itself, I think: > >> io_queue_worker_create() can be run on a workqueue, and uses task work > >> to launch a worker thread from the context of a userspace thread; and > >> this worker thread can then accept commands to close file descriptors. > >> Except it doesn't accept commands to close io_uring file descriptors. > >> > >> A closer miss might be io_sync_cancel(), which holds a reference to > >> some normal file with fdget()/fdput() while calling into > >> io_run_task_work_sig(). However, from what I can tell, the only things > >> that are actually done with this file pointer are pointer comparisons, > >> so this also shouldn't have significant security impact. > >> > >> Would it make sense to use fget()/fput() instead of fdget()/fdput() in > >> io_sync_cancel(), io_uring_enter and io_uring_register? These > >> functions probably usually run in multithreaded environments anyway > >> (thanks to the io_uring worker threads), so I would think fdget() > >> shouldn't bring significant performance savings here? > > > > Let me run some testing on that. It's a mistake to think that it's > > usually multithreaded, generally if you end up using io-wq then it's no= t > > a fast path. A fast networked setup, for example, would never touch the > > threads and hence no threading would be implied by using io_uring. Ditt= o > > on the storage front, if you're just reading/writing or eg doing polled > > IO. That said, those workloads are generally threaded _anyway_ - not > > because of io_uring, but because that's how these kinds of workloads ar= e > > written to begin with. Aah, because with polled I/O, when the fd is signalled as ready, the actual execution of work is done via task_work? Thanks for the explanation, I missed that. > > So probably won't be much of a concern to do the swap. The only > > "interesting" part of the above mix of cancel/register/enter is > > obviously the enter part. The rest are not really fast path. > > Did all three and ran the usual testing, which just so happens to be > multithreaded to begin with anyway. No discernable change from using > fget/fput over fdget/fdput. Oh, nice! > IOW, we may as well do this. Do you want to send a patch? Or I can send > out mine, up to you. Ah, if you already cooked up a patch to do the testing, I guess it's easier if you commit that?