Date: Thu, 29 Aug 2019 18:55:24 -0000
From: "tip-bot2 for Thomas Gleixner"
Reply-To: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: x86/urgent] x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text
Cc: Song Liu, Thomas Gleixner, "Peter Zijlstra (Intel)", stable@vger.kernel.org,
    Ingo Molnar, Borislav Petkov, linux-kernel@vger.kernel.org
Message-ID: <156710492460.9654.8831877201682543400.tip-bot2@tip-bot2>
X-Mailer: tip-git-log-daemon

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     7af0145067bc429a09ac4047b167c0971c9f0dc7
Gitweb:        https://git.kernel.org/tip/7af0145067bc429a09ac4047b167c0971c9f0dc7
Author:        Thomas Gleixner
AuthorDate:    Thu, 29 Aug 2019 00:31:34 +02:00
Committer:     Thomas Gleixner
CommitterDate: Thu, 29 Aug 2019 20:48:44 +02:00

x86/mm/cpa: Prevent large page split when ftrace flips RW on kernel text

ftrace does not use text_poke() to enable trace functionality. It uses its
own mechanism and flips the whole kernel text to RW and back to RO.

The CPA rework removed a loop-based check of 4k pages which tried to
preserve a large page by checking each 4k page to see whether the change
would actually cover all pages in the large page.

This resulted in endless loops for nothing, as testing showed that it
never actually preserved anything. Of course, that testing failed to
include ftrace, which is the one and only case that benefitted from the
4k loop.

As a consequence, enabling function tracing or ftrace-based kprobes
results in a full 4k split of the kernel text, which hurts iTLB
performance.

The kernel RO protection is the only valid case where this can actually
preserve large pages.

All other static protections (RO data, data NX, PCI, BIOS) are truly
static. So a conflict with those protections which results in a split
should only ever happen when a change to memory next to a protected
region is attempted. Such conflicts rightfully split the large page to
preserve the protected regions. In fact, a change to the protected
regions themselves is a bug and is warned about.

Add an exception to the static protection check for kernel text RO when
the region to be changed spans a full large page, which allows the large
mapping to be preserved. This also prevents the syslog from being spammed
with CPA violation warnings when ftrace is used.

The exception needs to be removed once ftrace has switched over to
text_poke(), which avoids the whole issue.
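
To make the rule above concrete, here is a minimal standalone C sketch of
the preservation condition: the kernel text RO check may only be skipped
when the requested change starts on a large page boundary and covers that
large page exactly, which is what ftrace's whole-text RW/RO flip does.
The names spans_full_large_page() and LPSIZE_2M and the example addresses
are invented for this sketch; the real check lives in static_protections()
in arch/x86/mm/pageattr.c, as shown in the patch below.

#include <stdio.h>
#include <stdbool.h>

#define PAGE_SIZE	4096UL
#define LPSIZE_2M	(512UL * PAGE_SIZE)	/* 2 MiB large page on x86 */

/*
 * Stand-in for the new check in static_protections(): the Text RO
 * protection may be skipped only when the change starts on a large page
 * boundary and covers exactly lpsize bytes. This is the inverse of the
 * patch's condition: lpsize != (npg * PAGE_SIZE) || (start & (lpsize - 1)).
 */
static bool spans_full_large_page(unsigned long start, unsigned long npg,
				  unsigned long lpsize)
{
	if (lpsize != npg * PAGE_SIZE)	/* change does not cover lpsize bytes */
		return false;
	if (start & (lpsize - 1))	/* change is not large page aligned */
		return false;
	return true;
}

int main(void)
{
	/* ftrace-like case: an aligned change covering one whole 2 MiB page */
	printf("whole 2M page : %d\n",
	       spans_full_large_page(0x81000000UL, 512, LPSIZE_2M));

	/* a single 4k page inside that large page, e.g. a debug poke */
	printf("single 4k page: %d\n",
	       spans_full_large_page(0x81001000UL, 1, LPSIZE_2M));

	/* lpsize = 0, as handed in by the 4k call sites in the patch */
	printf("lpsize == 0   : %d\n",
	       spans_full_large_page(0x81000000UL, 1, 0));

	return 0;
}

Built as a normal user space program this prints 1, 0 and 0 for the three
cases: only the aligned, full-size change would keep the large mapping.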
Fixes: 585948f4f695 ("x86/mm/cpa: Avoid the 4k pages check completely")
Reported-by: Song Liu
Signed-off-by: Thomas Gleixner
Tested-by: Song Liu
Reviewed-by: Song Liu
Acked-by: Peter Zijlstra (Intel)
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908282355340.1938@nanos.tec.linutronix.de
---
 arch/x86/mm/pageattr.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 6a9a77a..e14e95e 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -516,7 +516,7 @@ static inline void check_conflict(int warnlvl, pgprot_t prot, pgprotval_t val,
  */
 static inline pgprot_t static_protections(pgprot_t prot, unsigned long start,
                                           unsigned long pfn, unsigned long npg,
-                                          int warnlvl)
+                                          unsigned long lpsize, int warnlvl)
 {
         pgprotval_t forbidden, res;
         unsigned long end;
@@ -535,9 +535,17 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long start,
         check_conflict(warnlvl, prot, res, start, end, pfn, "Text NX");
         forbidden = res;
 
-        res = protect_kernel_text_ro(start, end);
-        check_conflict(warnlvl, prot, res, start, end, pfn, "Text RO");
-        forbidden |= res;
+        /*
+         * Special case to preserve a large page. If the change spans the
+         * full large page mapping then there is no point to split it
+         * up. Happens with ftrace and is going to be removed once ftrace
+         * switched to text_poke().
+         */
+        if (lpsize != (npg * PAGE_SIZE) || (start & (lpsize - 1))) {
+                res = protect_kernel_text_ro(start, end);
+                check_conflict(warnlvl, prot, res, start, end, pfn, "Text RO");
+                forbidden |= res;
+        }
 
         /* Check the PFN directly */
         res = protect_pci_bios(pfn, pfn + npg - 1);
@@ -819,7 +827,7 @@ static int __should_split_large_page(pte_t *kpte, unsigned long address,
          * extra conditional required here.
          */
         chk_prot = static_protections(old_prot, lpaddr, old_pfn, numpages,
-                                      CPA_CONFLICT);
+                                      psize, CPA_CONFLICT);
 
         if (WARN_ON_ONCE(pgprot_val(chk_prot) != pgprot_val(old_prot))) {
                 /*
@@ -855,7 +863,7 @@ static int __should_split_large_page(pte_t *kpte, unsigned long address,
          * protection requirement in the large page.
          */
         new_prot = static_protections(req_prot, lpaddr, old_pfn, numpages,
-                                      CPA_DETECT);
+                                      psize, CPA_DETECT);
 
         /*
          * If there is a conflict, split the large page.
@@ -906,7 +914,8 @@ static void split_set_pte(struct cpa_data *cpa, pte_t *pte, unsigned long pfn,
         if (!cpa->force_static_prot)
                 goto set;
 
-        prot = static_protections(ref_prot, address, pfn, npg, CPA_PROTECT);
+        /* Hand in lpsize = 0 to enforce the protection mechanism */
+        prot = static_protections(ref_prot, address, pfn, npg, 0, CPA_PROTECT);
 
         if (pgprot_val(prot) == pgprot_val(ref_prot))
                 goto set;
@@ -1503,7 +1512,8 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
                 pgprot_val(new_prot) |= pgprot_val(cpa->mask_set);
 
                 cpa_inc_4k_install();
-                new_prot = static_protections(new_prot, address, pfn, 1,
+                /* Hand in lpsize = 0 to enforce the protection mechanism */
+                new_prot = static_protections(new_prot, address, pfn, 1, 0,
                                               CPA_PROTECT);
 
                 new_prot = pgprot_clear_protnone_bits(new_prot);
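
A note on the call-site convention visible in the hunks above:
__should_split_large_page() hands the actual large page size (psize) to
static_protections(), so a request that exactly covers an aligned large
page can leave the mapping intact, whereas split_set_pte() and the 4k
install path in __change_page_attr() hand in lpsize = 0, which can never
equal npg * PAGE_SIZE and therefore always keeps the kernel text RO check
enforced, matching the "Hand in lpsize = 0 to enforce the protection
mechanism" comments in the patch and the behaviour of the sketch earlier
in this mail.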