From: Axel Rasmussen <axelrasmussen@google.com>
Date: Thu, 28 Jan 2021 14:48:12 -0800
Subject: [PATCH v3 2/9] hugetlb/userfaultfd: Forbid huge pmd sharing when uffd enabled
Message-Id: <20210128224819.2651899-3-axelrasmussen@google.com>
In-Reply-To: <20210128224819.2651899-1-axelrasmussen@google.com>
References: <20210128224819.2651899-1-axelrasmussen@google.com>
To: Alexander Viro, Alexey Dobriyan, Andrea Arcangeli, Andrew Morton,
    Anshuman Khandual, Catalin Marinas, Chinwen Chang, Huang Ying,
    Ingo Molnar, Jann Horn, Jerome Glisse, Lokesh Gidra,
    Matthew Wilcox (Oracle), Michael Ellerman, Michal Koutný,
    Michel Lespinasse, Mike Kravetz, Mike Rapoport, Nicholas Piggin,
    Peter Xu, Shaohua Li, Shawn Anastasio, Steven Rostedt, Steven Price,
    Vlastimil Babka
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, Adam Ruprecht, Axel Rasmussen, Cannon Matthews,
    Dr. David Alan Gilbert, David Rientjes, Oliver Upton

From: Peter Xu

Huge pmd sharing can cause problems for userfaultfd: userfaultfd drives
its logic from special bits in the page table entries, while huge pmd
sharing can share page table entries across different address ranges.
That can go wrong in either of two ways:

- When sharing huge pmd page tables for an uffd write-protected range,
  the newly mapped huge pmd range will unexpectedly be write-protected
  too, or,

- When we try to write-protect a huge-pmd-shared range, we first do
  huge_pmd_unshare() in hugetlb_change_protection(); however, that also
  means the UFFDIO_WRITEPROTECT could be silently skipped for the shared
  region, which could lead to data loss.

(The userspace sequence that sets up such a range is sketched below.)

While at it, a few other things are done as well:

- Move want_pmd_share() from mm/hugetlb.c into linux/hugetlb.h, because
  that is something arch code would like to use too.

- ARM64 currently checks directly against CONFIG_ARCH_WANT_HUGE_PMD_SHARE
  when trying to share huge pmds. Switch it to the want_pmd_share()
  helper.
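As a hypothetical illustration (not part of this patch), the userspace
sequence that creates such an uffd-wp registered hugetlb range might look
as follows. The function name and error handling are made up for the
sketch, and it assumes a kernel where UFFDIO_REGISTER_MODE_WP is accepted
on hugetlb mappings:

	#include <fcntl.h>
	#include <linux/userfaultfd.h>
	#include <stddef.h>
	#include <sys/ioctl.h>
	#include <sys/mman.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	/* Register a shared hugetlb mapping with uffd-wp, then write-protect it. */
	static int wp_hugetlb_range(int hugetlb_fd, size_t len)
	{
		int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
		struct uffdio_api api = { .api = UFFD_API };

		if (uffd < 0 || ioctl(uffd, UFFDIO_API, &api))
			return -1;

		/* MAP_SHARED hugetlb mappings are the ones eligible for pmd sharing. */
		char *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
				  MAP_SHARED, hugetlb_fd, 0);
		if (addr == MAP_FAILED)
			return -1;

		/*
		 * Sets VM_UFFD_WP on the vma; with this patch, that also
		 * disables huge pmd sharing for the range.
		 */
		struct uffdio_register reg = {
			.range = { .start = (unsigned long)addr, .len = len },
			.mode = UFFDIO_REGISTER_MODE_WP,
		};
		if (ioctl(uffd, UFFDIO_REGISTER, &reg))
			return -1;

		/*
		 * Write-protect the range; the wp state lives in the pgtable
		 * entries, which therefore must not be shared across ranges.
		 */
		struct uffdio_writeprotect wp = {
			.range = { .start = (unsigned long)addr, .len = len },
			.mode = UFFDIO_WRITEPROTECT_MODE_WP,
		};
		return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
	}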
Signed-off-by: Peter Xu
Signed-off-by: Axel Rasmussen
---
 arch/arm64/mm/hugetlbpage.c   |  3 +--
 include/linux/hugetlb.h       | 15 +++++++++++++++
 include/linux/userfaultfd_k.h |  9 +++++++++
 mm/hugetlb.c                  |  5 ++---
 4 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 5b32ec888698..1a8ce0facfe8 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -284,8 +284,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 		 */
 		ptep = pte_alloc_map(mm, pmdp, addr);
 	} else if (sz == PMD_SIZE) {
-		if (IS_ENABLED(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) &&
-		    pud_none(READ_ONCE(*pudp)))
+		if (want_pmd_share(vma) && pud_none(READ_ONCE(*pudp)))
 			ptep = huge_pmd_share(mm, addr, pudp);
 		else
 			ptep = (pte_t *)pmd_alloc(mm, pudp, addr);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 1e0abb609976..4508136c8376 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -11,6 +11,7 @@
 #include <linux/kref.h>
 #include <linux/pgtable.h>
 #include <linux/gfp.h>
+#include <linux/userfaultfd_k.h>
 
 struct ctl_table;
 struct user_struct;
@@ -947,4 +948,18 @@ static inline __init void hugetlb_cma_check(void)
 {
 }
 #endif
+static inline bool want_pmd_share(struct vm_area_struct *vma)
+{
+#ifdef CONFIG_USERFAULTFD
+	if (uffd_disable_huge_pmd_share(vma))
+		return false;
+#endif
+
+#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
+	return true;
+#else
+	return false;
+#endif
+}
+
 #endif /* _LINUX_HUGETLB_H */
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index a8e5f3ea9bb2..c63ccdae3eab 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -52,6 +52,15 @@ static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
 	return vma->vm_userfaultfd_ctx.ctx == vm_ctx.ctx;
 }
 
+/*
+ * Never enable huge pmd sharing on uffd-wp registered vmas, because uffd-wp
+ * protect information is per pgtable entry.
+ */
+static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
+{
+	return vma->vm_flags & VM_UFFD_WP;
+}
+
 static inline bool userfaultfd_missing(struct vm_area_struct *vma)
 {
 	return vma->vm_flags & VM_UFFD_MISSING;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 07b23c81b1db..d46f50a99ff1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5371,7 +5371,7 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
 	*addr = ALIGN(*addr, HPAGE_SIZE * PTRS_PER_PTE) - HPAGE_SIZE;
 	return 1;
 }
-#define want_pmd_share()	(1)
+
 #else /* !CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
 pte_t *huge_pmd_share(struct mm_struct *mm, unsigned long addr, pud_t *pud)
 {
@@ -5388,7 +5388,6 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
 				unsigned long *start, unsigned long *end)
 {
 }
-#define want_pmd_share()	(0)
 #endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
 
 #ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB
@@ -5410,7 +5409,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
 		pte = (pte_t *)pud;
 	} else {
 		BUG_ON(sz != PMD_SIZE);
-		if (want_pmd_share() && pud_none(*pud))
+		if (want_pmd_share(vma) && pud_none(*pud))
 			pte = huge_pmd_share(mm, addr, pud);
 		else
 			pte = (pte_t *)pmd_alloc(mm, pud, addr);
-- 
2.30.0.365.g02bc693789-goog