Received: by 2002:a05:6a10:f347:0:0:0:0 with SMTP id d7csp2504443pxu; Sat, 28 Nov 2020 16:53:08 -0800 (PST) X-Google-Smtp-Source: ABdhPJzxG1bQEc412vNqPTgJgOhWyRhC7YnJ6/0UcTfcMkSpeudMjWFk1W9jAPbAMCDYX4ZZOCC2 X-Received: by 2002:a17:906:2612:: with SMTP id h18mr6616537ejc.469.1606611187905; Sat, 28 Nov 2020 16:53:07 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1606611187; cv=none; d=google.com; s=arc-20160816; b=qAIQTYIQZtt2o0r+t2ugOeGPd4u38mKsHXXZH5EQk6OpwggrlYswgj4OzKfNlGkx6S LG8d+eNKs4a8WkdWOudCk7DkJmsXZi1uFSIGX623HHh7sZ9pdyjUSRhT2pjxSHYyaDvC kH2L3J5QaVVQzrORRWvvcRxW/68XBXpZYGAXr5yXRj8SyrZ+hSLbE1qE+30vWJ2w9LQW v12rEjgII6h2HcFRpwMcWNMcQBMkAuGWaythSpRDoG/DGylSCwj9R9rH4OGgoGwpXEdb 5D3KE2eikp7TjdqFGCRHyVRzuMIPCHOi/0Y08BxP5O4Ofgsjlu0faxm/TJ+ZmAE0wGdi Llaw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=nFkJoKPeq39q/ozwvQXnPpdStMMrP7v5qpVsCdMDZk4=; b=kK/AcXo0lFG0a1fWIGHv3K4q3EpVQM7BYr2SZtWMGZAPNDTz40UVpVDG4ZNQtSiVEr sXW/rKh0qtRHimsUbguorzBQBb5bBwldFL9e1FXOX9x71/KjnWukZSnBMTEqtWUbc7Rf dtwGxaBVdP0aZqAWZR2w7KEUMd3trR4UviMy1bdVA5GLYyIwgGqYaqW+pqqEIxBl5znk AaHdPvKR3eLhIjP78ryjIx7Y3Awr2Yl6guhkmcn8PiosL4Wld9LgN48rvXJXPk3pqssG QP/mS8IOPoqrbwEd7A248SFIkf9KMoETukQbfH87hx5lCvzpHWMmJUiDldKaNt8q6aMg QQmw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=C3gcjbX3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id i8si6224275edv.236.2020.11.28.16.52.45; Sat, 28 Nov 2020 16:53:07 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=C3gcjbX3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729400AbgK2AuY (ORCPT + 99 others); Sat, 28 Nov 2020 19:50:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40674 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725616AbgK2AuX (ORCPT ); Sat, 28 Nov 2020 19:50:23 -0500 Received: from mail-pf1-x441.google.com (mail-pf1-x441.google.com [IPv6:2607:f8b0:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 44226C0613D3; Sat, 28 Nov 2020 16:49:58 -0800 (PST) Received: by mail-pf1-x441.google.com with SMTP id b63so7728540pfg.12; Sat, 28 Nov 2020 16:49:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=nFkJoKPeq39q/ozwvQXnPpdStMMrP7v5qpVsCdMDZk4=; b=C3gcjbX3aLSo4umLVqP+rCP8zJSdnMBS69XdnatzhP8jmv74bTeGnEW8CXv2KZSvlR +Qdgb8UFvN/NzhkGTffCThDIZho5WOhV02h1uRh3dAQSH9F+q18LUJn9XEsQbYgaApbg LmK8VjHbxuvFZhkmMgD4A1ihXkT/7rrjf0dxnts/5Lrm2V8tSY8vl24Px26ANzI/FXa1 HXtO999TyEe8v/31E4q5z8Zkpge1cuLXUucjAJ0f2/kxrm59ljfbP9k9gmICJWFWL/ov 50ZaeTjH4DZ2L3VxBcE5XDjLRxVvdFxLU/gKS0ZeT0XF705fRa7mv8ddd/v/G7ToMGGx pKBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nFkJoKPeq39q/ozwvQXnPpdStMMrP7v5qpVsCdMDZk4=; b=UOY1D++AZFuiOiB7JIVtnAl3//9TYpFWzI2l0tPGD3Yho43BJ7PlVsgTac0sDM00e5 gMYfpBa1O7YVlIJv4Nwgy+n8XBId/azpgn3TYFvGUKRvE9keVB833MzQrKhXIMmuCaAn kWvnGqrIavEIxdvvWA/voumhcL8EZNyShALYWOhw5oKrdFX/Qt0Ce7IEavNDlXMDXtG4 EOoIRzYNdAP058zKaNzysCTU63wx8MTXRMtQ6C8q3vjbj6cwSih9Nk8T3kdWJmRpWecI blq4TY3aBGKNm3369y3bw4Fjckrx7rWqVTrw4BzM9W2cPHGvmORrojPXoIA91JMLpyaN ahAw== X-Gm-Message-State: AOAM530/IfKid1w3M1iE+M6Abs7JDuvSSzEVQzCcSsI8Mb2zCarDvxvS +lgEAIKNqOssZYgqzMZvo9GQe/B+gbecfA== X-Received: by 2002:a17:90a:b118:: with SMTP id z24mr8951103pjq.14.1606610997318; Sat, 28 Nov 2020 16:49:57 -0800 (PST) Received: from sc2-haas01-esx0118.eng.vmware.com ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id gg19sm16444871pjb.21.2020.11.28.16.49.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Nov 2020 16:49:56 -0800 (PST) From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-fsdevel@vger.kernel.org Cc: Nadav Amit , Jens Axboe , Andrea Arcangeli , Peter Xu , Alexander Viro , io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: [RFC PATCH 02/13] fs/userfaultfd: fix wrong file usage with iouring Date: Sat, 28 Nov 2020 16:45:37 -0800 Message-Id: <20201129004548.1619714-3-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20201129004548.1619714-1-namit@vmware.com> References: <20201129004548.1619714-1-namit@vmware.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Nadav Amit Using io-uring with userfaultfd for reads can lead upon a fork event to the installation of the userfaultfd file descriptor on the worker kernel thread instead of the process that initiated the read. io-uring assumes that no new file descriptors are installed during read. As a result the controlling process would not be able to access the new forked process userfaultfd file descriptor. To solve this problem, Save the files_struct of the process that initiated userfaultfd syscall in the context and reload it when needed. Cc: Jens Axboe Cc: Andrea Arcangeli Cc: Peter Xu Cc: Alexander Viro Cc: io-uring@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Fixes: 2b188cc1bb85 ("Add io_uring IO interface") Signed-off-by: Nadav Amit --- fs/userfaultfd.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index c8ed4320370e..4fe07c1a44c6 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -27,6 +27,7 @@ #include #include #include +#include int sysctl_unprivileged_userfaultfd __read_mostly = 1; @@ -76,6 +77,8 @@ struct userfaultfd_ctx { bool mmap_changing; /* mm with one ore more vmas attached to this userfaultfd_ctx */ struct mm_struct *mm; + /* controlling process files as they might be different than current */ + struct files_struct *files; }; struct userfaultfd_fork_ctx { @@ -173,6 +176,7 @@ static void userfaultfd_ctx_put(struct userfaultfd_ctx *ctx) VM_BUG_ON(spin_is_locked(&ctx->fd_wqh.lock)); VM_BUG_ON(waitqueue_active(&ctx->fd_wqh)); mmdrop(ctx->mm); + put_files_struct(ctx->files); kmem_cache_free(userfaultfd_ctx_cachep, ctx); } } @@ -666,6 +670,8 @@ int dup_userfaultfd(struct vm_area_struct *vma, struct list_head *fcs) ctx->mm = vma->vm_mm; mmgrab(ctx->mm); + ctx->files = octx->files; + atomic_inc(&ctx->files->count); userfaultfd_ctx_get(octx); WRITE_ONCE(octx->mmap_changing, true); fctx->orig = octx; @@ -976,10 +982,32 @@ static int resolve_userfault_fork(struct userfaultfd_ctx *ctx, struct userfaultfd_ctx *new, struct uffd_msg *msg) { + struct files_struct *files = NULL; int fd; + BUG_ON(new->files == NULL); + + /* + * This function can be called from another context than the controlling + * process, for instance, for an io-uring submission kernel thread. If + * that is the case we must ensure the correct files are being used. + */ + if (current->files != new->files) { + task_lock(current); + files = current->files; + current->files = new->files; + task_unlock(current); + } + fd = anon_inode_getfd("[userfaultfd]", &userfaultfd_fops, new, O_RDWR | (new->flags & UFFD_SHARED_FCNTL_FLAGS)); + + if (files != NULL) { + task_lock(current); + current->files = files; + task_unlock(current); + } + if (fd < 0) return fd; @@ -1986,6 +2014,8 @@ SYSCALL_DEFINE1(userfaultfd, int, flags) /* prevent the mm struct to be freed */ mmgrab(ctx->mm); + ctx->files = get_files_struct(current); + fd = anon_inode_getfd("[userfaultfd]", &userfaultfd_fops, ctx, O_RDWR | (flags & UFFD_SHARED_FCNTL_FLAGS)); if (fd < 0) { -- 2.25.1