From: Chih-En Lin
To: Andrew Morton, linux-mm@kvack.org
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Christian Brauner,
	"Matthew Wilcox (Oracle)", Vlastimil Babka, William Kucharski,
	John Hubbard, Yunsheng Lin, Arnd Bergmann, Suren Baghdasaryan,
	Chih-En Lin, Colin Cross, Feng Tang, "Eric W. Biederman",
	Mike Rapoport, Geert Uytterhoeven, Anshuman Khandual,
	"Aneesh Kumar K.V", Daniel Axtens, Jonathan Marek, Christophe Leroy,
	Pasha Tatashin, Peter Xu, Andrea Arcangeli, Thomas Gleixner,
	Andy Lutomirski, Sebastian Andrzej Siewior, Fenghua Yu,
	David Hildenbrand, linux-kernel@vger.kernel.org, Kaiyang Zhao,
	Huichun Feng, Jim Huang
Subject: [RFC PATCH 4/6] mm: Add COW PTE fallback function
Date: Fri, 20 May 2022 02:31:25 +0800
Message-Id: <20220519183127.3909598-5-shiyn.lin@gmail.com>
In-Reply-To: <20220519183127.3909598-1-shiyn.lin@gmail.com>
References: <20220519183127.3909598-1-shiyn.lin@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The lifetime of a COW PTE is managed by ownership and a reference
count. When a process wants to write to a COW PTE whose reference count
is 1, it reuses the COW PTE instead of copying it and then freeing the
original.

Only the owner updates its RSS state and its record of allocated page
table bytes. So the fallback has to handle the case where a non-owner
process ends up with the COW PTE.

This commit prepares for the reference count implementation for COW PTE
that follows in this series.

Signed-off-by: Chih-En Lin
---
 mm/memory.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 76e3af9639d9..dcb678cbb051 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1000,6 +1000,34 @@ page_copy_prealloc(struct mm_struct *src_mm, struct vm_area_struct *vma,
 	return new_page;
 }
 
+static inline void cow_pte_rss(struct mm_struct *mm, struct vm_area_struct *vma,
+		pmd_t *pmdp, unsigned long addr, unsigned long end, bool inc_dec)
+{
+	int rss[NR_MM_COUNTERS];
+	pte_t *orig_ptep, *ptep;
+	struct page *page;
+
+	init_rss_vec(rss);
+
+	ptep = pte_offset_map(pmdp, addr);
+	orig_ptep = ptep;
+	arch_enter_lazy_mmu_mode();
+	do {
+		if (pte_none(*ptep) || pte_special(*ptep))
+			continue;
+
+		page = vm_normal_page(vma, addr, *ptep);
+		if (page) {
+			if (inc_dec)
+				rss[mm_counter(page)]++;
+			else
+				rss[mm_counter(page)]--;
+		}
+	} while (ptep++, addr += PAGE_SIZE, addr != end);
+	arch_leave_lazy_mmu_mode();
+	add_mm_rss_vec(mm, rss);
+}
+
 static int
 copy_pte_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 	       pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr,
@@ -4554,6 +4582,44 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
 	return VM_FAULT_FALLBACK;
 }
 
+/* COW PTE fallback to normal PTE:
+ * - two state here
+ *   - After break child :  [parent, rss=1, ref=1, write=NO , owner=parent]
+ *                       to [parent, rss=1, ref=1, write=YES, owner=NULL  ]
+ *   - After break parent:  [child , rss=0, ref=1, write=NO , owner=NULL  ]
+ *                       to [child , rss=1, ref=1, write=YES, owner=NULL  ]
+ */
+void cow_pte_fallback(struct vm_area_struct *vma, pmd_t *pmd,
+		unsigned long addr)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	unsigned long start, end;
+	pmd_t new;
+
+	BUG_ON(pmd_write(*pmd));
+
+	start = addr & PMD_MASK;
+	end = (addr + PMD_SIZE) & PMD_MASK;
+
+	/* If pmd is not owner, it needs to increase the rss.
+	 * Since only the owner has the RSS state for the COW PTE.
+	 */
+	if (!cow_pte_owner_is_same(pmd, pmd)) {
+		cow_pte_rss(mm, vma, pmd, start, end, true /* inc */);
+		mm_inc_nr_ptes(mm);
+		smp_wmb();
+		pmd_populate(mm, pmd, pmd_page(*pmd));
+	}
+
+	/* Reuse the pte page */
+	set_cow_pte_owner(pmd, NULL);
+	new = pmd_mkwrite(*pmd);
+	set_pmd_at(mm, addr, pmd, new);
+
+	BUG_ON(!pmd_write(*pmd));
+	BUG_ON(pmd_page(*pmd)->cow_pte_owner);
+}
+
 /*
  * These routines also need to handle stuff like marking pages dirty
  * and/or accessed for architectures that don't do it in hardware (most
-- 
2.36.1
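
The two transitions listed in the cow_pte_fallback() comment can be modeled in userspace to make the accounting rule easier to follow. This is only a sketch under stated assumptions, not kernel code: struct mm_model, struct pte_table_model and model_fallback() are hypothetical names, a single rss field stands in for the per-counter RSS vector, nr_mapped stands in for the pages that cow_pte_rss() would count, and the owner is identified by process rather than by whatever the series actually stores in cow_pte_owner.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a process address space (not struct mm_struct). */
struct mm_model {
	long rss;	/* pages accounted to this process */
	long nr_ptes;	/* page tables accounted to this process */
};

/* Hypothetical stand-in for one shared PTE table plus the PMD entry mapping it. */
struct pte_table_model {
	struct mm_model *owner;	/* process that accounts the table, or NULL */
	int refcount;		/* number of processes still sharing the table */
	long nr_mapped;		/* normal pages mapped through the table */
	bool writable;		/* models the write bit of the PMD entry */
};

/*
 * Model of the fallback: the last sharer reuses the table instead of
 * copying it. Assumption of this sketch: the caller only gets here
 * with refcount == 1 and a write-protected entry.
 */
static void model_fallback(struct mm_model *mm, struct pte_table_model *t)
{
	assert(t->refcount == 1);
	assert(!t->writable);

	/* Only the owner has accounted the table, so a non-owner adds it now. */
	if (t->owner != mm) {
		mm->rss += t->nr_mapped;
		mm->nr_ptes += 1;
	}

	/* The table becomes a normal, unshared, writable PTE table. */
	t->owner = NULL;
	t->writable = true;
}
```

In this model, calling model_fallback() for the owning process leaves its counters untouched and only clears the owner and sets the write bit, which matches the "After break child" row. Calling it for a process that is not the owner adds nr_mapped pages and one page table to that process first, which matches the "After break parent" row.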