From: Chih-En Lin
To: Andrew Morton, Qi Zheng, David Hildenbrand,
	"Matthew Wilcox (Oracle)", Christophe Leroy, John Hubbard,
	Nadav Amit, Barry Song, Pasha Tatashin
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	"H. Peter Anvin", Steven Rostedt, Masami Hiramatsu, Peter Zijlstra,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Ian Rogers, Adrian Hunter, Yu Zhao,
	Steven Barrett, Juergen Gross, Peter Xu, Kefeng Wang, Tong Tiangen,
	Christoph Hellwig, "Liam R. Howlett", Yang Shi, Vlastimil Babka,
	Alex Sierra, Vincent Whitchurch, Anshuman Khandual, Li kunyu,
	Liu Shixin, Hugh Dickins, Minchan Kim, Joey Gouly, Chih-En Lin,
	Michal Hocko, Suren Baghdasaryan, "Zach O'Keefe", Gautam Menghani,
	Catalin Marinas, Mark Brown, "Eric W. Biederman", Andrei Vagin,
	Shakeel Butt, Daniel Bristot de Oliveira, "Jason A. Donenfeld",
	Greg Kroah-Hartman, Alexey Gladkov, x86@kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
	linux-perf-users@vger.kernel.org, Dinglan Peng, Pedro Fonseca,
	Jim Huang, Huichun Feng
Subject: [PATCH v5 17/17] mm: Check the unexpected modification of COW-ed PTE
Date: Fri, 14 Apr 2023 22:23:41 +0800
Message-Id: <20230414142341.354556-18-shiyn.lin@gmail.com>
In-Reply-To: <20230414142341.354556-1-shiyn.lin@gmail.com>
References: <20230414142341.354556-1-shiyn.lin@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

In most cases, we don't expect any write access to a COW-ed PTE table.
To catch such accesses, add a new modification check to the page table
check.

However, there are still some valid reasons to modify a COW-ed PTE
table. Therefore, add enable/disable functions for the check.

Signed-off-by: Chih-En Lin
---
 arch/x86/include/asm/pgtable.h   |  1 +
 include/linux/page_table_check.h | 62 ++++++++++++++++++++++++++++++++
 mm/memory.c                      |  4 +++
 mm/page_table_check.c            | 58 ++++++++++++++++++++++++++++++
 4 files changed, 125 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 7425f32e5293..6b323c672e36 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1022,6 +1022,7 @@ static inline pud_t native_local_pudp_get_and_clear(pud_t *pudp)
 static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 			      pte_t *ptep, pte_t pte)
 {
+	cowed_pte_table_check_modify(mm, addr, ptep, pte);
 	page_table_check_pte_set(mm, addr, ptep, pte);
 	set_pte(ptep, pte);
 }
diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h
index 01e16c7696ec..4a54dc454281 100644
--- a/include/linux/page_table_check.h
+++ b/include/linux/page_table_check.h
@@ -113,6 +113,54 @@ static inline void page_table_check_pte_clear_range(struct mm_struct *mm,
 	__page_table_check_pte_clear_range(mm, addr, pmd);
 }
 
+#ifdef CONFIG_COW_PTE
+void __check_cowed_pte_table_enable(pte_t *ptep);
+void __check_cowed_pte_table_disable(pte_t *ptep);
+void __cowed_pte_table_check_modify(struct mm_struct *mm, unsigned long addr,
+				    pte_t *ptep, pte_t pte);
+
+static inline void check_cowed_pte_table_enable(pte_t *ptep)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__check_cowed_pte_table_enable(ptep);
+}
+
+static inline void check_cowed_pte_table_disable(pte_t *ptep)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__check_cowed_pte_table_disable(ptep);
+}
+
+static inline void cowed_pte_table_check_modify(struct mm_struct *mm,
+						unsigned long addr,
+						pte_t *ptep, pte_t pte)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__cowed_pte_table_check_modify(mm, addr, ptep, pte);
+}
+#else
+static inline void check_cowed_pte_table_enable(pte_t *ptep)
+{
+}
+
+static inline void check_cowed_pte_table_disable(pte_t *ptep)
+{
+}
+
+static inline void cowed_pte_table_check_modify(struct mm_struct *mm,
+						unsigned long addr,
+						pte_t *ptep, pte_t pte)
+{
+}
+#endif /* CONFIG_COW_PTE */
+
+
 #else
 
 static inline void page_table_check_alloc(struct page *page, unsigned int order)
@@ -162,5 +210,19 @@ static inline void page_table_check_pte_clear_range(struct mm_struct *mm,
 {
 }
 
+static inline void check_cowed_pte_table_enable(pte_t *ptep)
+{
+}
+
+static inline void check_cowed_pte_table_disable(pte_t *ptep)
+{
+}
+
+static inline void cowed_pte_table_check_modify(struct mm_struct *mm,
+						unsigned long addr,
+						pte_t *ptep, pte_t pte)
+{
+}
+
 #endif /* CONFIG_PAGE_TABLE_CHECK */
 #endif /* __LINUX_PAGE_TABLE_CHECK_H */
diff --git a/mm/memory.c b/mm/memory.c
index 7908e20f802a..e62487413038 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1202,10 +1202,12 @@ copy_cow_pte_range(struct vm_area_struct *dst_vma,
 				 * Although, parent's PTE is COW-ed, we should
 				 * still need to handle all the swap stuffs.
 				 */
+				check_cowed_pte_table_disable(src_pte);
 				ret = copy_nonpresent_pte(dst_mm, src_mm,
 							  src_pte, src_pte,
 							  curr, curr,
 							  addr, rss);
+				check_cowed_pte_table_enable(src_pte);
 				if (ret == -EIO) {
 					entry = pte_to_swp_entry(*src_pte);
 					break;
@@ -1223,8 +1225,10 @@ copy_cow_pte_range(struct vm_area_struct *dst_vma,
 			 * copy_present_pte() will determine the mapped page
 			 * should be COW mapping or not.
 			 */
+			check_cowed_pte_table_disable(src_pte);
 			ret = copy_present_pte(curr, curr, src_pte, src_pte,
 					       addr, rss, NULL);
+			check_cowed_pte_table_enable(src_pte);
 			/*
 			 * If we need a pre-allocated page for this pte,
 			 * drop the lock, recover all the entries, fall
diff --git a/mm/page_table_check.c b/mm/page_table_check.c
index 25d8610c0042..5175c7476508 100644
--- a/mm/page_table_check.c
+++ b/mm/page_table_check.c
@@ -14,6 +14,9 @@
 struct page_table_check {
 	atomic_t anon_map_count;
 	atomic_t file_map_count;
+#ifdef CONFIG_COW_PTE
+	atomic_t check_cowed_pte;
+#endif
 };
 
 static bool __page_table_check_enabled __initdata =
@@ -248,3 +251,58 @@ void __page_table_check_pte_clear_range(struct mm_struct *mm,
 		pte_unmap(ptep - PTRS_PER_PTE);
 	}
 }
+
+#ifdef CONFIG_COW_PTE
+void __check_cowed_pte_table_enable(pte_t *ptep)
+{
+	struct page *page = pte_page(*ptep);
+	struct page_ext *page_ext = page_ext_get(page);
+	struct page_table_check *ptc = get_page_table_check(page_ext);
+
+	atomic_set(&ptc->check_cowed_pte, 1);
+	page_ext_put(page_ext);
+}
+
+void __check_cowed_pte_table_disable(pte_t *ptep)
+{
+	struct page *page = pte_page(*ptep);
+	struct page_ext *page_ext = page_ext_get(page);
+	struct page_table_check *ptc = get_page_table_check(page_ext);
+
+	atomic_set(&ptc->check_cowed_pte, 0);
+	page_ext_put(page_ext);
+}
+
+static int check_cowed_pte_table(pte_t *ptep)
+{
+	struct page *page = pte_page(*ptep);
+	struct page_ext *page_ext = page_ext_get(page);
+	struct page_table_check *ptc = get_page_table_check(page_ext);
+	int check = 0;
+
+	check = atomic_read(&ptc->check_cowed_pte);
+	page_ext_put(page_ext);
+
+	return check;
+}
+
+void __cowed_pte_table_check_modify(struct mm_struct *mm, unsigned long addr,
+				    pte_t *ptep, pte_t pte)
+{
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	if (!test_bit(MMF_COW_PTE, &mm->flags) || !check_cowed_pte_table(ptep))
+		return;
+
+	pgd = pgd_offset(mm, addr);
+	p4d = p4d_offset(pgd, addr);
+	pud = pud_offset(p4d, addr);
+	pmd = pmd_offset(pud, addr);
+
+	if (!pmd_none(*pmd) && !pmd_write(*pmd) && cow_pte_count(pmd) > 1)
+		BUG_ON(!pte_same(*ptep, pte));
+}
+#endif
-- 
2.34.1
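
For readers following the thread, the gating logic of this patch can be modeled in userspace. The block below is only a rough sketch under stated assumptions and is not kernel code: `toy_pte_table`, `toy_table_init()`, `toy_check_enable()`, `toy_check_disable()`, `toy_check_modify()` and `toy_set_pte()` are hypothetical names invented for illustration. The `cow_refcount` and `write_protected` fields stand in for `cow_pte_count(pmd) > 1` and `!pmd_write(*pmd)`, the flag starts out cleared (an assumption about zero-initialized per-page data), and a violation is reported through a return value where the patch would hit `BUG_ON()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PTRS_PER_PTE 8	/* toy size, unrelated to the kernel's value */

/* Hypothetical stand-in for a PTE table plus its per-page check state. */
struct toy_pte_table {
	uint64_t entries[TOY_PTRS_PER_PTE];
	int cow_refcount;		/* models cow_pte_count(pmd) */
	bool write_protected;		/* models !pmd_write(*pmd) */
	atomic_int check_cowed_pte;	/* models ptc->check_cowed_pte */
};

static void toy_table_init(struct toy_pte_table *t, int refcount, bool wp)
{
	for (int i = 0; i < TOY_PTRS_PER_PTE; i++)
		t->entries[i] = 0;
	t->cow_refcount = refcount;
	t->write_protected = wp;
	/* Assumption: the check is off until someone enables it. */
	atomic_init(&t->check_cowed_pte, 0);
}

static void toy_check_enable(struct toy_pte_table *t)
{
	atomic_store(&t->check_cowed_pte, 1);
}

static void toy_check_disable(struct toy_pte_table *t)
{
	atomic_store(&t->check_cowed_pte, 0);
}

/*
 * Return true if writing new_pte to slot idx would be an unexpected
 * modification: the check is enabled, the table is shared and
 * write-protected, and the value actually changes.
 */
static bool toy_check_modify(struct toy_pte_table *t, unsigned int idx,
			     uint64_t new_pte)
{
	if (!atomic_load(&t->check_cowed_pte))
		return false;

	if (t->write_protected && t->cow_refcount > 1)
		return t->entries[idx] != new_pte;

	return false;
}

/* Toy set_pte_at(): run the check first, skip the write on a violation. */
static bool toy_set_pte(struct toy_pte_table *t, unsigned int idx,
			uint64_t new_pte)
{
	bool violation = toy_check_modify(t, idx, new_pte);

	if (!violation)
		t->entries[idx] = new_pte;
	return violation;
}
```

In this model, a caller that legitimately needs to touch a shared table brackets the write with `toy_check_disable()` and `toy_check_enable()`, in the same way the patch brackets `copy_nonpresent_pte()` and `copy_present_pte()` in `copy_cow_pte_range()`. Unlike the patch, the toy refuses the write on a violation so that the outcome can be observed without crashing.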