From: Jakub Matěna
To: linux-mm@kvack.org
Cc: patches@lists.linux.dev, linux-kernel@vger.kernel.org, vbabka@suse.cz,
    mhocko@kernel.org, mgorman@techsingularity.net, willy@infradead.org,
    liam.howlett@oracle.com, hughd@google.com, kirill@shutemov.name,
    riel@surriel.com, rostedt@goodmis.org, peterz@infradead.org,
    Jakub Matěna
Subject: [RFC PATCH v2 2/4] [PATCH 2/4] mm: adjust page offset in mremap
Date: Fri, 11 Mar 2022 18:46:00 +0100
Message-Id: <20220311174602.288010-3-matenajakub@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20220311174602.288010-1-matenajakub@gmail.com>
References: <20220311174602.288010-1-matenajakub@gmail.com>

Adjust page offset of a VMA when it's moved to a new location by mremap.
This is made possible for all VMAs that do not share their anonymous
pages with other processes. Previously this was possible only for not
yet faulted VMAs.
When the page offset does not correspond to the virtual address
of the anonymous VMA any merge attempt with another VMA will fail.

Signed-off-by: Jakub Matěna
---
 mm/mmap.c | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 mm/rmap.c | 37 +++++++++++++++++++++++++++++
 2 files changed, 100 insertions(+), 6 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 8d817b11c656..4f9c6ca7ff4e 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3218,6 +3218,59 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
 	return 0;
 }
 
+/**
+ * update_faulted_pgoff() - Update faulted pages of a vma
+ * @vma: VMA being moved
+ * @addr: new virtual address
+ * @pgoff: pointer to pgoff which is updated
+ * If the vma and its pages are not shared with another process, update
+ * the new pgoff and also update index parameter (copy of the pgoff) in
+ * all faulted pages.
+ */
+bool update_faulted_pgoff(struct vm_area_struct *vma, unsigned long addr, pgoff_t *pgoff)
+{
+	unsigned long pg_iter = 0;
+	unsigned long pg_iters = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+
+	/* Check vma is not shared with other processes */
+	if (vma->anon_vma->root != vma->anon_vma || !rbt_no_children(vma->anon_vma))
+		return false;
+
+	/* Check all pages are not shared */
+	for (; pg_iter < pg_iters; ++pg_iter) {
+		bool pages_not_shared = true;
+		unsigned long shift = pg_iter << PAGE_SHIFT;
+		struct page *phys_page = follow_page(vma, vma->vm_start + shift, FOLL_GET);
+
+		if (phys_page == NULL)
+			continue;
+
+		/* Check page is not shared with other processes */
+		if (page_mapcount(phys_page) > 1)
+			pages_not_shared = false;
+		put_page(phys_page);
+		if (!pages_not_shared)
+			return false;
+	}
+
+	/* Update index in all pages to this new pgoff */
+	pg_iter = 0;
+	*pgoff = addr >> PAGE_SHIFT;
+
+	for (; pg_iter < pg_iters; ++pg_iter) {
+		unsigned long shift = pg_iter << PAGE_SHIFT;
+		struct page *phys_page = follow_page(vma, vma->vm_start + shift, FOLL_GET);
+
+		if (phys_page == NULL)
+			continue;
+		lock_page(phys_page);
+		phys_page->index = *pgoff + pg_iter;
+		unlock_page(phys_page);
+		put_page(phys_page);
+	}
+	return true;
+}
+
 /*
  * Copy the vma structure to a new location in the same mm,
  * prior to moving page table entries, to effect an mremap move.
@@ -3231,15 +3284,19 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 	struct mm_struct *mm = vma->vm_mm;
 	struct vm_area_struct *new_vma, *prev;
 	struct rb_node **rb_link, *rb_parent;
-	bool faulted_in_anon_vma = true;
+	bool anon_pgoff_updated = false;
 
 	/*
-	 * If anonymous vma has not yet been faulted, update new pgoff
+	 * Try to update new pgoff for anonymous vma
 	 * to match new location, to increase its chance of merging.
 	 */
-	if (unlikely(vma_is_anonymous(vma) && !vma->anon_vma)) {
-		pgoff = addr >> PAGE_SHIFT;
-		faulted_in_anon_vma = false;
+	if (unlikely(vma_is_anonymous(vma))) {
+		if (!vma->anon_vma) {
+			pgoff = addr >> PAGE_SHIFT;
+			anon_pgoff_updated = true;
+		} else {
+			anon_pgoff_updated = update_faulted_pgoff(vma, addr, &pgoff);
+		}
 	}
 
 	if (find_vma_links(mm, addr, addr + len, &prev, &rb_link, &rb_parent))
@@ -3265,7 +3322,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 			 * safe. It is only safe to keep the vm_pgoff
 			 * linear if there are no pages mapped yet.
 			 */
-			VM_BUG_ON_VMA(faulted_in_anon_vma, new_vma);
+			VM_BUG_ON_VMA(!anon_pgoff_updated, new_vma);
 			*vmap = vma = new_vma;
 		}
 		*need_rmap_locks = (new_vma->vm_pgoff <= vma->vm_pgoff);
diff --git a/mm/rmap.c b/mm/rmap.c
index 6a1e8c7f6213..96273d6a9796 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -387,6 +387,43 @@ int anon_vma_fork(struct vm_area_struct *vma, struct vm_area_struct *pvma)
 	return -ENOMEM;
 }
 
+/*
+ * Used by rbt_no_children to check node subtree.
+ * Check if none of the VMAs connected to the node subtree via
+ * anon_vma_chain are in child relationship to the given anon_vma.
+ */
+bool rbst_no_children(struct anon_vma *av, struct rb_node *node)
+{
+	struct anon_vma_chain *model;
+	struct anon_vma_chain *avc;
+
+	if (node == NULL) /* leaf node */
+		return true;
+	avc = container_of(node, typeof(*(model)), rb);
+	if (avc->vma->anon_vma != av)
+		/*
+		 * Inequality implies avc belongs
+		 * to a VMA of a child process
+		 */
+		return false;
+	return (rbst_no_children(av, node->rb_left) &&
+	rbst_no_children(av, node->rb_right));
+}
+
+/*
+ * Check if none of the VMAs connected to the given
+ * anon_vma via anon_vma_chain are in child relationship
+ */
+bool rbt_no_children(struct anon_vma *av)
+{
+	struct rb_node *root_node;
+
+	if (av == NULL || av->degree <= 1) /* Higher degree might not necessarily imply children */
+		return true;
+	root_node = av->rb_root.rb_root.rb_node;
+	return rbst_no_children(av, root_node);
+}
+
 void unlink_anon_vmas(struct vm_area_struct *vma)
 {
 	struct anon_vma_chain *avc, *next;
-- 
2.34.1
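
For readers following the pgoff arithmetic in the changelog, here is a minimal userspace sketch of why a moved anonymous VMA fails to merge until its page offset is made linear with its new address. It is not kernel code. The names sketch_vma, sketch_move and sketch_pgoff_mergeable are hypothetical, SKETCH_PAGE_SHIFT assumes 4 KiB pages, and the merge test is reduced to adjacency plus pgoff continuity, ignoring the flag, anon_vma and policy checks a real merge would also need.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SHIFT depends on the architecture. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical userspace stand-in for the VMA fields that matter here. */
struct sketch_vma {
	unsigned long start;	/* first virtual address */
	unsigned long end;	/* one past the last virtual address */
	unsigned long pgoff;	/* page offset of the first page */
};

/* pgoff that is linear with the address, as in "pgoff = addr >> PAGE_SHIFT". */
static unsigned long sketch_linear_pgoff(unsigned long addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Index the i-th page would carry, mirroring "*pgoff + pg_iter" in the patch. */
static unsigned long sketch_page_index(const struct sketch_vma *vma, unsigned long i)
{
	return vma->pgoff + i;
}

/*
 * Simplified merge precondition (an assumption of this sketch): the areas
 * are adjacent and next's pgoff continues where prev's pages end.
 */
static bool sketch_pgoff_mergeable(const struct sketch_vma *prev,
				   const struct sketch_vma *next)
{
	unsigned long prev_pages = (prev->end - prev->start) >> SKETCH_PAGE_SHIFT;

	return prev->end == next->start &&
	       prev->pgoff + prev_pages == next->pgoff;
}

/* Move a VMA to new_addr, optionally making its pgoff linear again. */
static struct sketch_vma sketch_move(struct sketch_vma vma, unsigned long new_addr,
				     bool adjust_pgoff)
{
	unsigned long len;

	assert(vma.end >= vma.start);
	len = vma.end - vma.start;
	vma.start = new_addr;
	vma.end = new_addr + len;
	if (adjust_pgoff)
		vma.pgoff = sketch_linear_pgoff(new_addr);
	return vma;
}
```

In this model, moving a one-page area from 0x40000 to sit right after an area ending at 0x12000 leaves it unmergeable if it keeps pgoff 0x40, and mergeable once the pgoff is recomputed as 0x12.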