Date: Tue, 30 Jan 2024 09:21:54 +0200
From: Mike Rapoport
To: Lokesh Gidra
Cc: akpm@linux-foundation.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, selinux@vger.kernel.org,
	surenb@google.com, kernel-team@android.com, aarcange@redhat.com,
	peterx@redhat.com, david@redhat.com, axelrasmussen@google.com,
	bgeffon@google.com, willy@infradead.org, jannh@google.com,
	kaleshsingh@google.com, ngeoffray@google.com, timmurray@google.com
Subject: Re: [PATCH v2 2/3] userfaultfd: protect mmap_changing with rw_sem in userfaulfd_ctx
References: <20240129193512.123145-1-lokeshgidra@google.com>
 <20240129193512.123145-3-lokeshgidra@google.com>
In-Reply-To: <20240129193512.123145-3-lokeshgidra@google.com>

On Mon, Jan 29, 2024 at 11:35:11AM -0800, Lokesh Gidra wrote:
> Increments and loads to mmap_changing are always in mmap_lock
> critical section. This ensures that if userspace requests event
> notification for non-cooperative operations (e.g. mremap), userfaultfd
> operations don't occur concurrently.
> 
> This can be achieved by using a separate read-write semaphore in
> userfaultfd_ctx such that increments are done in write-mode and loads
> in read-mode, thereby eliminating the dependency on mmap_lock for this
> purpose.
> 
> This is a preparatory step before we replace mmap_lock usage with
> per-vma locks in fill/move ioctls.
> 
> Signed-off-by: Lokesh Gidra

Reviewed-by: Mike Rapoport (IBM)
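For readers skimming the diff, the locking pattern the patch introduces boils
down to the minimal sketch below. It is condensed from the hunks that follow;
the two helper functions are illustrative placeholders, not functions added by
the patch:

	/* non-cooperative event (fork/mremap/remove/unmap): write mode */
	static void uffd_event_mark_mmap_changing(struct userfaultfd_ctx *ctx)
	{
		down_write(&ctx->map_changing_lock);
		atomic_inc(&ctx->mmap_changing);
		up_write(&ctx->map_changing_lock);
	}

	/* fill/move/wp operation: read mode held across the check and the work */
	static ssize_t uffd_op_if_stable(struct userfaultfd_ctx *ctx)
	{
		ssize_t err = -EAGAIN;

		down_read(&ctx->map_changing_lock);
		if (!atomic_read(&ctx->mmap_changing))
			err = 0;	/* the fill/move/wp work happens here */
		up_read(&ctx->map_changing_lock);
		return err;
	}

With this, an operation either observes mmap_changing == 0 and finishes before
the next event can bump the counter (the event's down_write waits for the
readers), or it bails out with -EAGAIN; the rw_semaphore, not mmap_lock, is
what orders the two.
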
> ---
>  fs/userfaultfd.c              | 40 ++++++++++++----------
>  include/linux/userfaultfd_k.h | 31 ++++++++++--------
>  mm/userfaultfd.c              | 62 ++++++++++++++++++++---------------
>  3 files changed, 75 insertions(+), 58 deletions(-)
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 58331b83d648..c00a021bcce4 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -685,12 +685,15 @@ int dup_userfaultfd(struct vm_area_struct *vma, struct list_head *fcs)
> 		ctx->flags = octx->flags;
> 		ctx->features = octx->features;
> 		ctx->released = false;
> +		init_rwsem(&ctx->map_changing_lock);
> 		atomic_set(&ctx->mmap_changing, 0);
> 		ctx->mm = vma->vm_mm;
> 		mmgrab(ctx->mm);
>
> 		userfaultfd_ctx_get(octx);
> +		down_write(&octx->map_changing_lock);
> 		atomic_inc(&octx->mmap_changing);
> +		up_write(&octx->map_changing_lock);
> 		fctx->orig = octx;
> 		fctx->new = ctx;
> 		list_add_tail(&fctx->list, fcs);
> @@ -737,7 +740,9 @@ void mremap_userfaultfd_prep(struct vm_area_struct *vma,
> 	if (ctx->features & UFFD_FEATURE_EVENT_REMAP) {
> 		vm_ctx->ctx = ctx;
> 		userfaultfd_ctx_get(ctx);
> +		down_write(&ctx->map_changing_lock);
> 		atomic_inc(&ctx->mmap_changing);
> +		up_write(&ctx->map_changing_lock);
> 	} else {
> 		/* Drop uffd context if remap feature not enabled */
> 		vma_start_write(vma);
> @@ -783,7 +788,9 @@ bool userfaultfd_remove(struct vm_area_struct *vma,
> 		return true;
>
> 	userfaultfd_ctx_get(ctx);
> +	down_write(&ctx->map_changing_lock);
> 	atomic_inc(&ctx->mmap_changing);
> +	up_write(&ctx->map_changing_lock);
> 	mmap_read_unlock(mm);
>
> 	msg_init(&ewq.msg);
> @@ -825,7 +832,9 @@ int userfaultfd_unmap_prep(struct vm_area_struct *vma, unsigned long start,
> 		return -ENOMEM;
>
> 	userfaultfd_ctx_get(ctx);
> +	down_write(&ctx->map_changing_lock);
> 	atomic_inc(&ctx->mmap_changing);
> +	up_write(&ctx->map_changing_lock);
> 	unmap_ctx->ctx = ctx;
> 	unmap_ctx->start = start;
> 	unmap_ctx->end = end;
> @@ -1709,9 +1718,8 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
> 	if (uffdio_copy.mode & UFFDIO_COPY_MODE_WP)
> 		flags |= MFILL_ATOMIC_WP;
> 	if (mmget_not_zero(ctx->mm)) {
> -		ret = mfill_atomic_copy(ctx->mm, uffdio_copy.dst, uffdio_copy.src,
> -					uffdio_copy.len, &ctx->mmap_changing,
> -					flags);
> +		ret = mfill_atomic_copy(ctx, uffdio_copy.dst, uffdio_copy.src,
> +					uffdio_copy.len, flags);
> 		mmput(ctx->mm);
> 	} else {
> 		return -ESRCH;
> @@ -1761,9 +1769,8 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
> 		goto out;
>
> 	if (mmget_not_zero(ctx->mm)) {
> -		ret = mfill_atomic_zeropage(ctx->mm, uffdio_zeropage.range.start,
> -					    uffdio_zeropage.range.len,
> -					    &ctx->mmap_changing);
> +		ret = mfill_atomic_zeropage(ctx, uffdio_zeropage.range.start,
> +					    uffdio_zeropage.range.len);
> 		mmput(ctx->mm);
> 	} else {
> 		return -ESRCH;
> @@ -1818,9 +1825,8 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
> 		return -EINVAL;
>
> 	if (mmget_not_zero(ctx->mm)) {
> -		ret = mwriteprotect_range(ctx->mm, uffdio_wp.range.start,
> -					  uffdio_wp.range.len, mode_wp,
> -					  &ctx->mmap_changing);
> +		ret = mwriteprotect_range(ctx, uffdio_wp.range.start,
> +					  uffdio_wp.range.len, mode_wp);
> 		mmput(ctx->mm);
> 	} else {
> 		return -ESRCH;
> @@ -1870,9 +1876,8 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
> 		flags |= MFILL_ATOMIC_WP;
>
> 	if (mmget_not_zero(ctx->mm)) {
> -		ret = mfill_atomic_continue(ctx->mm, uffdio_continue.range.start,
> -					    uffdio_continue.range.len,
> -					    &ctx->mmap_changing, flags);
> +		ret = mfill_atomic_continue(ctx, uffdio_continue.range.start,
> +					    uffdio_continue.range.len, flags);
> 		mmput(ctx->mm);
> 	} else {
> 		return -ESRCH;
> @@ -1925,9 +1930,8 @@ static inline int userfaultfd_poison(struct userfaultfd_ctx *ctx, unsigned long
> 		goto out;
>
> 	if (mmget_not_zero(ctx->mm)) {
> -		ret = mfill_atomic_poison(ctx->mm, uffdio_poison.range.start,
> -					  uffdio_poison.range.len,
> -					  &ctx->mmap_changing, 0);
> +		ret = mfill_atomic_poison(ctx, uffdio_poison.range.start,
> +					  uffdio_poison.range.len, 0);
> 		mmput(ctx->mm);
> 	} else {
> 		return -ESRCH;
> @@ -2003,13 +2007,14 @@ static int userfaultfd_move(struct userfaultfd_ctx *ctx,
> 	if (mmget_not_zero(mm)) {
> 		mmap_read_lock(mm);
>
> -		/* Re-check after taking mmap_lock */
> +		/* Re-check after taking map_changing_lock */
> +		down_read(&ctx->map_changing_lock);
> 		if (likely(!atomic_read(&ctx->mmap_changing)))
> 			ret = move_pages(ctx, mm, uffdio_move.dst, uffdio_move.src,
> 					 uffdio_move.len, uffdio_move.mode);
> 		else
> 			ret = -EAGAIN;
> -
> +		up_read(&ctx->map_changing_lock);
> 		mmap_read_unlock(mm);
> 		mmput(mm);
> 	} else {
> @@ -2216,6 +2221,7 @@ static int new_userfaultfd(int flags)
> 	ctx->flags = flags;
> 	ctx->features = 0;
> 	ctx->released = false;
> +	init_rwsem(&ctx->map_changing_lock);
> 	atomic_set(&ctx->mmap_changing, 0);
> 	ctx->mm = current->mm;
> 	/* prevent the mm struct to be freed */
> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
> index 691d928ee864..3210c3552976 100644
> --- a/include/linux/userfaultfd_k.h
> +++ b/include/linux/userfaultfd_k.h
> @@ -69,6 +69,13 @@ struct userfaultfd_ctx {
> 	unsigned int features;
> 	/* released */
> 	bool released;
> +	/*
> +	 * Prevents userfaultfd operations (fill/move/wp) from happening while
> +	 * some non-cooperative event(s) is taking place. Increments are done
> +	 * in write-mode. Whereas, userfaultfd operations, which includes
> +	 * reading mmap_changing, is done under read-mode.
> +	 */
> +	struct rw_semaphore map_changing_lock;
> 	/* memory mappings are changing because of non-cooperative event */
> 	atomic_t mmap_changing;
> 	/* mm with one ore more vmas attached to this userfaultfd_ctx */
> @@ -113,22 +120,18 @@ extern int mfill_atomic_install_pte(pmd_t *dst_pmd,
> 				    unsigned long dst_addr, struct page *page,
> 				    bool newly_allocated, uffd_flags_t flags);
>
> -extern ssize_t mfill_atomic_copy(struct mm_struct *dst_mm, unsigned long dst_start,
> +extern ssize_t mfill_atomic_copy(struct userfaultfd_ctx *ctx, unsigned long dst_start,
> 				 unsigned long src_start, unsigned long len,
> -				 atomic_t *mmap_changing, uffd_flags_t flags);
> -extern ssize_t mfill_atomic_zeropage(struct mm_struct *dst_mm,
> +				 uffd_flags_t flags);
> +extern ssize_t mfill_atomic_zeropage(struct userfaultfd_ctx *ctx,
> 				     unsigned long dst_start,
> -				     unsigned long len,
> -				     atomic_t *mmap_changing);
> -extern ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long dst_start,
> -				     unsigned long len, atomic_t *mmap_changing,
> -				     uffd_flags_t flags);
> -extern ssize_t mfill_atomic_poison(struct mm_struct *dst_mm, unsigned long start,
> -				   unsigned long len, atomic_t *mmap_changing,
> -				   uffd_flags_t flags);
> -extern int mwriteprotect_range(struct mm_struct *dst_mm,
> -			       unsigned long start, unsigned long len,
> -			       bool enable_wp, atomic_t *mmap_changing);
> +				     unsigned long len);
> +extern ssize_t mfill_atomic_continue(struct userfaultfd_ctx *ctx, unsigned long dst_start,
> +				     unsigned long len, uffd_flags_t flags);
> +extern ssize_t mfill_atomic_poison(struct userfaultfd_ctx *ctx, unsigned long start,
> +				   unsigned long len, uffd_flags_t flags);
> +extern int mwriteprotect_range(struct userfaultfd_ctx *ctx, unsigned long start,
> +			       unsigned long len, bool enable_wp);
> extern long uffd_wp_range(struct vm_area_struct *vma,
> 			  unsigned long start, unsigned long len, bool enable_wp);
>
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index e3a91871462a..6e2ca04ab04d 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -353,11 +353,11 @@ static pmd_t *mm_alloc_pmd(struct mm_struct *mm, unsigned long address)
>  * called with mmap_lock held, it will release mmap_lock before returning.
>  */
> static __always_inline ssize_t mfill_atomic_hugetlb(
> +					      struct userfaultfd_ctx *ctx,
> 					      struct vm_area_struct *dst_vma,
> 					      unsigned long dst_start,
> 					      unsigned long src_start,
> 					      unsigned long len,
> -					      atomic_t *mmap_changing,
> 					      uffd_flags_t flags)
> {
> 	struct mm_struct *dst_mm = dst_vma->vm_mm;
> @@ -379,6 +379,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
> 	 * feature is not supported.
> 	 */
> 	if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE)) {
> +		up_read(&ctx->map_changing_lock);
> 		mmap_read_unlock(dst_mm);
> 		return -EINVAL;
> 	}
> @@ -463,6 +464,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
> 		cond_resched();
>
> 		if (unlikely(err == -ENOENT)) {
> +			up_read(&ctx->map_changing_lock);
> 			mmap_read_unlock(dst_mm);
> 			BUG_ON(!folio);
>
> @@ -473,12 +475,13 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
> 				goto out;
> 			}
> 			mmap_read_lock(dst_mm);
> +			down_read(&ctx->map_changing_lock);
> 			/*
> 			 * If memory mappings are changing because of non-cooperative
> 			 * operation (e.g. mremap) running in parallel, bail out and
> 			 * request the user to retry later
> 			 */
> -			if (mmap_changing && atomic_read(mmap_changing)) {
> +			if (atomic_read(ctx->mmap_changing)) {
> 				err = -EAGAIN;
> 				break;
> 			}
> @@ -501,6 +504,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
> 	}
>
> out_unlock:
> +	up_read(&ctx->map_changing_lock);
> 	mmap_read_unlock(dst_mm);
> out:
> 	if (folio)
> @@ -512,11 +516,11 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
> }
> #else /* !CONFIG_HUGETLB_PAGE */
> /* fail at build time if gcc attempts to use this */
> -extern ssize_t mfill_atomic_hugetlb(struct vm_area_struct *dst_vma,
> +extern ssize_t mfill_atomic_hugetlb(struct userfaultfd_ctx *ctx,
> +				    struct vm_area_struct *dst_vma,
> 				    unsigned long dst_start,
> 				    unsigned long src_start,
> 				    unsigned long len,
> -				    atomic_t *mmap_changing,
> 				    uffd_flags_t flags);
> #endif /* CONFIG_HUGETLB_PAGE */
>
> @@ -564,13 +568,13 @@ static __always_inline ssize_t mfill_atomic_pte(pmd_t *dst_pmd,
> 	return err;
> }
>
> -static __always_inline ssize_t mfill_atomic(struct mm_struct *dst_mm,
> +static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
> 					    unsigned long dst_start,
> 					    unsigned long src_start,
> 					    unsigned long len,
> -					    atomic_t *mmap_changing,
> 					    uffd_flags_t flags)
> {
> +	struct mm_struct *dst_mm = ctx->mm;
> 	struct vm_area_struct *dst_vma;
> 	ssize_t err;
> 	pmd_t *dst_pmd;
> @@ -600,8 +604,9 @@ static __always_inline ssize_t mfill_atomic(struct mm_struct *dst_mm,
> 	 * operation (e.g. mremap) running in parallel, bail out and
> 	 * request the user to retry later
> 	 */
> +	down_read(&ctx->map_changing_lock);
> 	err = -EAGAIN;
> -	if (mmap_changing && atomic_read(mmap_changing))
> +	if (atomic_read(&ctx->mmap_changing))
> 		goto out_unlock;
>
> 	/*
> @@ -633,8 +638,8 @@ static __always_inline ssize_t mfill_atomic(struct mm_struct *dst_mm,
> 	 * If this is a HUGETLB vma, pass off to appropriate routine
> 	 */
> 	if (is_vm_hugetlb_page(dst_vma))
> -		return mfill_atomic_hugetlb(dst_vma, dst_start, src_start,
> -					    len, mmap_changing, flags);
> +		return mfill_atomic_hugetlb(ctx, dst_vma, dst_start,
> +					    src_start, len, flags);
>
> 	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
> 		goto out_unlock;
> @@ -693,6 +698,7 @@ static __always_inline ssize_t mfill_atomic(struct mm_struct *dst_mm,
> 		if (unlikely(err == -ENOENT)) {
> 			void *kaddr;
>
> +			up_read(&ctx->map_changing_lock);
> 			mmap_read_unlock(dst_mm);
> 			BUG_ON(!folio);
>
> @@ -723,6 +729,7 @@ static __always_inline ssize_t mfill_atomic(struct mm_struct *dst_mm,
> 	}
>
> out_unlock:
> +	up_read(&ctx->map_changing_lock);
> 	mmap_read_unlock(dst_mm);
> out:
> 	if (folio)
> @@ -733,34 +740,33 @@ static __always_inline ssize_t mfill_atomic(struct mm_struct *dst_mm,
> 	return copied ? copied : err;
> }
>
> -ssize_t mfill_atomic_copy(struct mm_struct *dst_mm, unsigned long dst_start,
> +ssize_t mfill_atomic_copy(struct userfaultfd_ctx *ctx, unsigned long dst_start,
> 			  unsigned long src_start, unsigned long len,
> -			  atomic_t *mmap_changing, uffd_flags_t flags)
> +			  uffd_flags_t flags)
> {
> -	return mfill_atomic(dst_mm, dst_start, src_start, len, mmap_changing,
> +	return mfill_atomic(ctx, dst_start, src_start, len,
> 			    uffd_flags_set_mode(flags, MFILL_ATOMIC_COPY));
> }
>
> -ssize_t mfill_atomic_zeropage(struct mm_struct *dst_mm, unsigned long start,
> -			      unsigned long len, atomic_t *mmap_changing)
> +ssize_t mfill_atomic_zeropage(struct userfaultfd_ctx *ctx,
> +			      unsigned long start,
> +			      unsigned long len)
> {
> -	return mfill_atomic(dst_mm, start, 0, len, mmap_changing,
> +	return mfill_atomic(ctx, start, 0, len,
> 			    uffd_flags_set_mode(0, MFILL_ATOMIC_ZEROPAGE));
> }
>
> -ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long start,
> -			      unsigned long len, atomic_t *mmap_changing,
> -			      uffd_flags_t flags)
> +ssize_t mfill_atomic_continue(struct userfaultfd_ctx *ctx, unsigned long start,
> +			      unsigned long len, uffd_flags_t flags)
> {
> -	return mfill_atomic(dst_mm, start, 0, len, mmap_changing,
> +	return mfill_atomic(ctx, start, 0, len,
> 			    uffd_flags_set_mode(flags, MFILL_ATOMIC_CONTINUE));
> }
>
> -ssize_t mfill_atomic_poison(struct mm_struct *dst_mm, unsigned long start,
> -			    unsigned long len, atomic_t *mmap_changing,
> -			    uffd_flags_t flags)
> +ssize_t mfill_atomic_poison(struct userfaultfd_ctx *ctx, unsigned long start,
> +			    unsigned long len, uffd_flags_t flags)
> {
> -	return mfill_atomic(dst_mm, start, 0, len, mmap_changing,
> +	return mfill_atomic(ctx, start, 0, len,
> 			    uffd_flags_set_mode(flags, MFILL_ATOMIC_POISON));
> }
>
> @@ -793,10 +799,10 @@ long uffd_wp_range(struct vm_area_struct *dst_vma,
> 	return ret;
> }
>
> -int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
> -			unsigned long len, bool enable_wp,
> -			atomic_t *mmap_changing)
> +int mwriteprotect_range(struct userfaultfd_ctx *ctx, unsigned long start,
> +			unsigned long len, bool enable_wp)
> {
> +	struct mm_struct *dst_mm = ctx->mm;
> 	unsigned long end = start + len;
> 	unsigned long _start, _end;
> 	struct vm_area_struct *dst_vma;
> @@ -820,8 +826,9 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
> 	 * operation (e.g. mremap) running in parallel, bail out and
> 	 * request the user to retry later
> 	 */
> +	down_read(&ctx->map_changing_lock);
> 	err = -EAGAIN;
> -	if (mmap_changing && atomic_read(mmap_changing))
> +	if (atomic_read(&ctx->mmap_changing))
> 		goto out_unlock;
>
> 	err = -ENOENT;
> @@ -850,6 +857,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
> 		err = 0;
> 	}
> out_unlock:
> +	up_read(&ctx->map_changing_lock);
> 	mmap_read_unlock(dst_mm);
> 	return err;
> }
> -- 
> 2.43.0.429.g432eaa2c6b-goog
> 

-- 
Sincerely yours,
Mike.