Date: Sun, 26 Jan 2020 21:30:56 -0800
In-Reply-To: <20200123014627.71720-1-bgeffon@google.com>
Message-Id: <20200127053056.213679-1-bgeffon@google.com>
References: <20200123014627.71720-1-bgeffon@google.com>
Subject: [PATCH v3] mm: Add MREMAP_DONTUNMAP to mremap().
From: Brian Geffon
To: Andrew Morton
Cc: "Michael S. Tsirkin", Brian Geffon, Arnd Bergmann,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-api@vger.kernel.org, Andy Lutomirski, Andrea Arcangeli,
    Sonny Rao, Minchan Kim, Joel Fernandes, Yu Zhao, Jesse Barnes,
    Nathan Chancellor

When remapping an anonymous, private mapping, if MREMAP_DONTUNMAP is
set, the source mapping will not be removed. Instead it will be
cleared as if a brand new anonymous, private mapping had been created
atomically as part of the mremap() call. If a userfaultfd was watching
the source, it will continue to watch the new mapping. For a mapping
that is shared or not anonymous, MREMAP_DONTUNMAP will cause the
mremap() call to fail. MREMAP_DONTUNMAP requires that MREMAP_FIXED is
also used.
The final result is two equally sized VMAs where the destination
contains the PTEs of the source.

We hope to use this in Chrome OS where with userfaultfd we could write
an anonymous mapping to disk without having to STOP the process or
worry about VMA permission changes.

This feature also has a use case in Android. Lokesh Gidra has said
that "As part of using userfaultfd for GC, We'll have to move the
physical pages of the java heap to a separate location. For this
purpose mremap will be used. Without the MREMAP_DONTUNMAP flag, when
I mremap the java heap, its virtual mapping will be removed as well.
Therefore, we'll require performing mmap immediately after. This is
not only time consuming but also opens a time window where a native
thread may call mmap and reserve the java heap's address range for
its own usage. This flag solves the problem."

Signed-off-by: Brian Geffon
---
 include/uapi/linux/mman.h |  5 +++--
 mm/mremap.c               | 38 +++++++++++++++++++++++++++++++-------
 2 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/include/uapi/linux/mman.h b/include/uapi/linux/mman.h
index fc1a64c3447b..923cc162609c 100644
--- a/include/uapi/linux/mman.h
+++ b/include/uapi/linux/mman.h
@@ -5,8 +5,9 @@
 #include
 #include
 
-#define MREMAP_MAYMOVE	1
-#define MREMAP_FIXED	2
+#define MREMAP_MAYMOVE		1
+#define MREMAP_FIXED		2
+#define MREMAP_DONTUNMAP	4
 
 #define OVERCOMMIT_GUESS		0
 #define OVERCOMMIT_ALWAYS		1
diff --git a/mm/mremap.c b/mm/mremap.c
index 122938dcec15..1d164e5fdff0 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -318,8 +318,8 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 static unsigned long move_vma(struct vm_area_struct *vma,
 		unsigned long old_addr, unsigned long old_len,
 		unsigned long new_len, unsigned long new_addr,
-		bool *locked, struct vm_userfaultfd_ctx *uf,
-		struct list_head *uf_unmap)
+		bool *locked, unsigned long flags,
+		struct vm_userfaultfd_ctx *uf, struct list_head *uf_unmap)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *new_vma;
@@ -408,6 +408,13 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 	if (unlikely(vma->vm_flags & VM_PFNMAP))
 		untrack_pfn_moved(vma);
 
+	if (unlikely(!err && (flags & MREMAP_DONTUNMAP))) {
+		if (vm_flags & VM_ACCOUNT)
+			vma->vm_flags |= VM_ACCOUNT;
+
+		goto out;
+	}
+
 	if (do_munmap(mm, old_addr, old_len, uf_unmap) < 0) {
 		/* OOM: unable to split vma, just get accounts right */
 		vm_unacct_memory(excess >> PAGE_SHIFT);
@@ -422,6 +429,7 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 			vma->vm_next->vm_flags |= VM_ACCOUNT;
 	}
 
+out:
 	if (vm_flags & VM_LOCKED) {
 		mm->locked_vm += new_len >> PAGE_SHIFT;
 		*locked = true;
@@ -497,7 +505,7 @@ static struct vm_area_struct *vma_to_resize(unsigned long addr,
 
 static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
 		unsigned long new_addr, unsigned long new_len, bool *locked,
-		struct vm_userfaultfd_ctx *uf,
+		unsigned long flags, struct vm_userfaultfd_ctx *uf,
 		struct list_head *uf_unmap_early,
 		struct list_head *uf_unmap)
 {
@@ -551,6 +559,17 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
 		goto out;
 	}
 
+	/*
+	 * MREMAP_DONTUNMAP expands by old_len + (new_len - old_len), we will
+	 * check that we can expand by old_len and vma_to_resize will handle
+	 * the vma growing.
+	 */
+	if (unlikely(flags & MREMAP_DONTUNMAP && !may_expand_vm(mm,
+				vma->vm_flags, old_len >> PAGE_SHIFT))) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
 	map_flags = MAP_FIXED;
 	if (vma->vm_flags & VM_MAYSHARE)
 		map_flags |= MAP_SHARED;
@@ -561,7 +580,7 @@ static unsigned long mremap_to(unsigned long addr, unsigned long old_len,
 	if (IS_ERR_VALUE(ret))
 		goto out1;
 
-	ret = move_vma(vma, addr, old_len, new_len, new_addr, locked, uf,
+	ret = move_vma(vma, addr, old_len, new_len, new_addr, locked, flags, uf,
 		       uf_unmap);
 	if (!(offset_in_page(ret)))
 		goto out;
@@ -609,12 +628,16 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 	addr = untagged_addr(addr);
 	new_addr = untagged_addr(new_addr);
 
-	if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE))
+	if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP)) {
 		return ret;
+	}
 
 	if (flags & MREMAP_FIXED && !(flags & MREMAP_MAYMOVE))
 		return ret;
 
+	if (flags & MREMAP_DONTUNMAP && !(flags & MREMAP_FIXED))
+		return ret;
+
 	if (offset_in_page(addr))
 		return ret;
 
@@ -634,7 +657,8 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 
 	if (flags & MREMAP_FIXED) {
 		ret = mremap_to(addr, old_len, new_addr, new_len,
-				&locked, &uf, &uf_unmap_early, &uf_unmap);
+				&locked, flags, &uf, &uf_unmap_early,
+				&uf_unmap);
 		goto out;
 	}
 
@@ -712,7 +736,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 		}
 
 		ret = move_vma(vma, addr, old_len, new_len, new_addr,
-			       &locked, &uf, &uf_unmap);
+			       &locked, flags, &uf, &uf_unmap);
 	}
 out:
 	if (offset_in_page(ret)) {
-- 
2.25.0.341.g760bfbb309-goog
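
For illustration only, and not part of the patch: a minimal userspace
sketch of how the flag described above might be exercised. The helper
name move_keep_source() is hypothetical, the fallback definition of
MREMAP_DONTUNMAP simply copies the value 4 from the mman.h hunk, and the
sketch assumes the semantics stated in the commit message (MREMAP_FIXED
required, source left mapped but cleared). On a kernel without the flag
the mremap() call is expected to fail with EINVAL.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Value copied from the patch; older libc headers may not define it. */
#ifndef MREMAP_DONTUNMAP
#define MREMAP_DONTUNMAP 4
#endif

/*
 * Hypothetical helper: move the pages of an anonymous, private mapping
 * to a reserved destination while leaving the source range mapped.
 * Returns 0 on success or a negative errno value on failure.
 */
static int move_keep_source(size_t len)
{
	char *src, *dst, *moved;
	int ret = 0;

	src = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (src == MAP_FAILED)
		return -errno;

	/* Reserve a destination range so MREMAP_FIXED has a known target. */
	dst = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (dst == MAP_FAILED) {
		ret = -errno;
		munmap(src, len);
		return ret;
	}

	/* Populate the source so there are pages to move. */
	memset(src, 0xAB, len);

	moved = mremap(src, len, len,
		       MREMAP_MAYMOVE | MREMAP_FIXED | MREMAP_DONTUNMAP, dst);
	if (moved == MAP_FAILED) {
		ret = -errno;	/* e.g. -EINVAL if the flag is unsupported */
		goto out;
	}

	/*
	 * The destination should now hold the old contents, and the source
	 * should still be mapped but read back as zeroes.
	 */
	if ((unsigned char)moved[0] != 0xAB || src[0] != 0)
		ret = -EIO;
out:
	munmap(src, len);
	munmap(dst, len);
	return ret;
}
```

A caller could pass a single page, e.g.
move_keep_source((size_t)sysconf(_SC_PAGESIZE)), and treat -EINVAL as
"flag not supported on this kernel".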