From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Jason Gunthorpe, Paolo Bonzini
Subject: [PATCH 5.10 16/23] mm: provide a saner PTE walking API for modules
Date: Thu, 25 Feb 2021 10:53:47 +0100
Message-Id: <20210225092517.305267434@linuxfoundation.org>
X-Mailer: git-send-email 2.30.1
In-Reply-To: <20210225092516.531932232@linuxfoundation.org>
References: <20210225092516.531932232@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Mailing-List: linux-kernel@vger.kernel.org

From: Paolo Bonzini

commit 9fd6dad1261a541b3f5fa7dc5b152222306e6702 upstream.

Currently, the follow_pfn function is exported for modules but
follow_pte is not.  However, follow_pfn is very easy to misuse,
because it does not provide protections (so most of its callers
assume the page is writable!) and because it returns after having
already unlocked the page table lock.

Provide instead a simplified version of follow_pte that does not have
the pmdpp and range arguments.  The older version survives as
follow_invalidate_pte() for use by fs/dax.c.

Reviewed-by: Jason Gunthorpe
Signed-off-by: Paolo Bonzini
Signed-off-by: Greg Kroah-Hartman
---
 fs/dax.c            |    5 +++--
 include/linux/mm.h  |    6 ++++--
 mm/memory.c         |   41 ++++++++++++++++++++++++++++++++++++-----
 virt/kvm/kvm_main.c |    4 ++--
 4 files changed, 45 insertions(+), 11 deletions(-)
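For existing callers the conversion is mechanical: drop the NULL range
and pmdpp arguments, and release the page table lock with
pte_unmap_unlock() once the PTE has been consumed.  A minimal sketch of
a converted caller follows (illustrative only, not part of this patch;
my_addr_to_pfn() is a hypothetical module-side helper, and per the new
kernel-doc the mmap lock must be held for read and only VM_IO/VM_PFNMAP
mappings are legitimate targets -- follow_pte() takes an mm, not a vma,
so it cannot check vm_flags itself):

#include <linux/mm.h>

static int my_addr_to_pfn(struct vm_area_struct *vma, unsigned long addr,
                          unsigned long *pfn)
{
        pte_t *ptep;
        spinlock_t *ptl;
        int ret;

        /* Old API: follow_pte(vma->vm_mm, addr, NULL, &ptep, NULL, &ptl) */
        ret = follow_pte(vma->vm_mm, addr, &ptep, &ptl);
        if (ret)
                return ret;

        /* *ptep is only stable while @ptl is held. */
        *pfn = pte_pfn(*ptep);
        pte_unmap_unlock(ptep, ptl);
        return 0;
}

Unlike with follow_pfn(), the PFN is read while the page table lock is
still held, which is exactly the window the old API gave up by
unlocking before returning.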
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -810,11 +810,12 @@ static void dax_entry_mkclean(struct add
 		address = pgoff_address(index, vma);
 
 		/*
-		 * Note because we provide range to follow_pte it will call
+		 * follow_invalidate_pte() will use the range to call
 		 * mmu_notifier_invalidate_range_start() on our behalf before
 		 * taking any lock.
 		 */
-		if (follow_pte(vma->vm_mm, address, &range, &ptep, &pmdp, &ptl))
+		if (follow_invalidate_pte(vma->vm_mm, address, &range, &ptep,
+					  &pmdp, &ptl))
 			continue;
 
 		/*
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1655,9 +1655,11 @@ void free_pgd_range(struct mmu_gather *t
 		unsigned long end, unsigned long floor, unsigned long ceiling);
 int copy_page_range(struct vm_area_struct *dst_vma,
 		    struct vm_area_struct *src_vma);
+int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
+			  struct mmu_notifier_range *range, pte_t **ptepp,
+			  pmd_t **pmdpp, spinlock_t **ptlp);
 int follow_pte(struct mm_struct *mm, unsigned long address,
-	       struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
-	       spinlock_t **ptlp);
+	       pte_t **ptepp, spinlock_t **ptlp);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
 	unsigned long *pfn);
 int follow_phys(struct vm_area_struct *vma, unsigned long address,
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4707,9 +4707,9 @@ int __pmd_alloc(struct mm_struct *mm, pu
 }
 #endif /* __PAGETABLE_PMD_FOLDED */
 
-int follow_pte(struct mm_struct *mm, unsigned long address,
-	       struct mmu_notifier_range *range, pte_t **ptepp, pmd_t **pmdpp,
-	       spinlock_t **ptlp)
+int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
+			  struct mmu_notifier_range *range, pte_t **ptepp,
+			  pmd_t **pmdpp, spinlock_t **ptlp)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -4775,6 +4775,34 @@ out:
 }
 
 /**
+ * follow_pte - look up PTE at a user virtual address
+ * @mm: the mm_struct of the target address space
+ * @address: user virtual address
+ * @ptepp: location to store found PTE
+ * @ptlp: location to store the lock for the PTE
+ *
+ * On a successful return, the pointer to the PTE is stored in @ptepp;
+ * the corresponding lock is taken and its location is stored in @ptlp.
+ * The contents of the PTE are only stable until @ptlp is released;
+ * any further use, if any, must be protected against invalidation
+ * with MMU notifiers.
+ *
+ * Only IO mappings and raw PFN mappings are allowed.  The mmap semaphore
+ * should be taken for read.
+ *
+ * KVM uses this function.  While it is arguably less bad than ``follow_pfn``,
+ * it is not a good general-purpose API.
+ *
+ * Return: zero on success, -ve otherwise.
+ */
+int follow_pte(struct mm_struct *mm, unsigned long address,
+	       pte_t **ptepp, spinlock_t **ptlp)
+{
+	return follow_invalidate_pte(mm, address, NULL, ptepp, NULL, ptlp);
+}
+EXPORT_SYMBOL_GPL(follow_pte);
+
+/**
  * follow_pfn - look up PFN at a user virtual address
  * @vma: memory mapping
  * @address: user virtual address
@@ -4782,6 +4810,9 @@ out:
  *
  * Only IO mappings and raw PFN mappings are allowed.
  *
+ * This function does not allow the caller to read the permissions
+ * of the PTE.  Do not use it.
+ *
 * Return: zero and the pfn at @pfn on success, -ve otherwise.
 */
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
@@ -4794,7 +4825,7 @@ int follow_pfn(struct vm_area_struct *vm
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
 		return ret;
 
-	ret = follow_pte(vma->vm_mm, address, NULL, &ptep, NULL, &ptl);
+	ret = follow_pte(vma->vm_mm, address, &ptep, &ptl);
 	if (ret)
 		return ret;
 	*pfn = pte_pfn(*ptep);
@@ -4815,7 +4846,7 @@ int follow_phys(struct vm_area_struct *v
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)))
 		goto out;
 
-	if (follow_pte(vma->vm_mm, address, NULL, &ptep, NULL, &ptl))
+	if (follow_pte(vma->vm_mm, address, &ptep, &ptl))
 		goto out;
 	pte = *ptep;
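The point of returning the PTE itself rather than a bare PFN shows up
in the kernel-doc above: the caller can inspect permission bits under
the page table lock before trusting the translation.  A hedged sketch
of that pattern, loosely modeled on the hva_to_pfn_remapped()
conversion below (my_pfn_if_writable() is hypothetical, not part of
this patch):

#include <linux/errno.h>
#include <linux/mm.h>

static int my_pfn_if_writable(struct vm_area_struct *vma, unsigned long addr,
                              unsigned long *pfn)
{
        pte_t *ptep;
        spinlock_t *ptl;

        if (follow_pte(vma->vm_mm, addr, &ptep, &ptl))
                return -EFAULT;

        if (!pte_write(*ptep)) {
                /* A read-only mapping: follow_pfn() could never tell. */
                pte_unmap_unlock(ptep, ptl);
                return -EACCES;
        }

        *pfn = pte_pfn(*ptep);
        pte_unmap_unlock(ptep, ptl);
        return 0;
}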
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1893,7 +1893,7 @@ static int hva_to_pfn_remapped(struct vm
 	spinlock_t *ptl;
 	int r;
 
-	r = follow_pte(vma->vm_mm, addr, NULL, &ptep, NULL, &ptl);
+	r = follow_pte(vma->vm_mm, addr, &ptep, &ptl);
 	if (r) {
 		/*
 		 * get_user_pages fails for VM_IO and VM_PFNMAP vmas and does
@@ -1908,7 +1908,7 @@ static int hva_to_pfn_remapped(struct vm
 		if (r)
 			return r;
 
-		r = follow_pte(vma->vm_mm, addr, NULL, &ptep, NULL, &ptl);
+		r = follow_pte(vma->vm_mm, addr, &ptep, &ptl);
 		if (r)
 			return r;
 	}
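For completeness, fs/dax.c keeps the range-based variant: when an
mmu_notifier_range is supplied, follow_invalidate_pte() issues
mmu_notifier_invalidate_range_start() on the caller's behalf before
taking any lock, and on success the caller pairs the unlock with
mmu_notifier_invalidate_range_end().  A condensed sketch of that
pattern, paraphrasing dax_entry_mkclean() (my_clean_entry() is
hypothetical, not part of this patch):

#include <linux/mm.h>
#include <linux/mmu_notifier.h>

static void my_clean_entry(struct vm_area_struct *vma, unsigned long address)
{
        struct mmu_notifier_range range;
        pte_t *ptep = NULL;
        pmd_t *pmdp = NULL;
        spinlock_t *ptl;

        if (follow_invalidate_pte(vma->vm_mm, address, &range, &ptep,
                                  &pmdp, &ptl))
                return;         /* on failure the caller skips _end() */

        if (pmdp) {
                /* ... operate on the pmd-level entry under @ptl ... */
                spin_unlock(ptl);
        } else {
                /* ... operate on the pte-level entry under @ptl ... */
                pte_unmap_unlock(ptep, ptl);
        }

        mmu_notifier_invalidate_range_end(&range);
}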