From: ira.weiny@intel.com
To: linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Daniel Borkmann, Davidlohr Bueso,
	netdev@vger.kernel.org
Cc: Mike Marciniszyn, Dennis Dalessandro, Doug Ledford, Jason Gunthorpe,
	Andrew Morton, "Kirill A. Shutemov", Dan Williams, Ira Weiny
Subject: [PATCH 2/3] mm/gup: Introduce get_user_pages_fast_longterm()
Date: Mon, 11 Feb 2019 12:16:42 -0800
Message-Id: <20190211201643.7599-3-ira.weiny@intel.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190211201643.7599-1-ira.weiny@intel.com>
References: <20190211201643.7599-1-ira.weiny@intel.com>

From: Ira Weiny

Users of get_user_pages_fast are not protected against mapping pages
within FS DAX. Introduce a call which protects them.

We do this by checking for DEVMAP pages during the fast walk and
falling back to the longterm gup call to check for FS DAX if needed.

Signed-off-by: Ira Weiny
---
 include/linux/mm.h |   8 ++++
 mm/gup.c           | 102 +++++++++++++++++++++++++++++++++++----------
 2 files changed, 88 insertions(+), 22 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 80bb6408fe73..8f831c823630 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1540,6 +1540,8 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 long get_user_pages_longterm(unsigned long start, unsigned long nr_pages,
 			    unsigned int gup_flags, struct page **pages,
 			    struct vm_area_struct **vmas);
+int get_user_pages_fast_longterm(unsigned long start, int nr_pages, bool write,
+				 struct page **pages);
 #else
 static inline long get_user_pages_longterm(unsigned long start,
 		unsigned long nr_pages, unsigned int gup_flags,
@@ -1547,6 +1549,11 @@ static inline long get_user_pages_longterm(unsigned long start,
 {
 	return get_user_pages(start, nr_pages, gup_flags, pages, vmas);
 }
+static inline int get_user_pages_fast_longterm(unsigned long start, int nr_pages,
+					       bool write, struct page **pages)
+{
+	return get_user_pages_fast(start, nr_pages, write, pages);
+}
 #endif /* CONFIG_FS_DAX */
 
 int get_user_pages_fast(unsigned long start, int nr_pages, int write,
@@ -2615,6 +2622,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 #define FOLL_REMOTE	0x2000	/* we are working on non-current tsk/mm */
 #define FOLL_COW	0x4000	/* internal GUP flag */
 #define FOLL_ANON	0x8000	/* don't do file mappings */
+#define FOLL_LONGTERM	0x10000	/* mapping is intended for a long term pin */
 
 static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
 {
diff --git a/mm/gup.c b/mm/gup.c
index 894ab014bd1e..f7d86a304405 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1190,6 +1190,21 @@ long get_user_pages_longterm(unsigned long start, unsigned long nr_pages,
 EXPORT_SYMBOL(get_user_pages_longterm);
 #endif /* CONFIG_FS_DAX */
 
+static long get_user_pages_longterm_unlocked(unsigned long start,
+					     unsigned long nr_pages,
+					     struct page **pages,
+					     unsigned int gup_flags)
+{
+	struct mm_struct *mm = current->mm;
+	long ret;
+
+	down_read(&mm->mmap_sem);
+	ret = get_user_pages_longterm(start, nr_pages, gup_flags, pages, NULL);
+	up_read(&mm->mmap_sem);
+
+	return ret;
+}
+
 /**
  * populate_vma_page_range() - populate a range of pages in the vma.
  * @vma:   target vma
@@ -1417,6 +1432,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 			goto pte_unmap;
 
 		if (pte_devmap(pte)) {
+			if (flags & FOLL_LONGTERM)
+				goto pte_unmap;
+
 			pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
 			if (unlikely(!pgmap)) {
 				undo_dev_pagemap(nr, nr_start, pages);
@@ -1556,8 +1574,12 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 	if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
 		return 0;
 
-	if (pmd_devmap(orig))
+	if (pmd_devmap(orig)) {
+		if (flags & FOLL_LONGTERM)
+			return 0;
+
 		return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr);
+	}
 
 	refs = 0;
 	page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
@@ -1837,24 +1859,9 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	return nr;
 }
 
-/**
- * get_user_pages_fast() - pin user pages in memory
- * @start:	starting user address
- * @nr_pages:	number of pages from start to pin
- * @write:	whether pages will be written to
- * @pages:	array that receives pointers to the pages pinned.
- *		Should be at least nr_pages long.
- *
- * Attempt to pin user pages in memory without taking mm->mmap_sem.
- * If not successful, it will fall back to taking the lock and
- * calling get_user_pages().
- *
- * Returns number of pages pinned. This may be fewer than the number
- * requested. If nr_pages is 0 or negative, returns 0. If no pages
- * were pinned, returns -errno.
- */
-int get_user_pages_fast(unsigned long start, int nr_pages, int write,
-			struct page **pages)
+static int __get_user_pages_fast_flags(unsigned long start, int nr_pages,
+				       unsigned int gup_flags,
+				       struct page **pages)
 {
 	unsigned long addr, len, end;
 	int nr = 0, ret = 0;
@@ -1872,7 +1879,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write,
 
 	if (gup_fast_permitted(start, nr_pages)) {
 		local_irq_disable();
-		gup_pgd_range(addr, end, write ? FOLL_WRITE : 0, pages, &nr);
+		gup_pgd_range(addr, end, gup_flags, pages, &nr);
 		local_irq_enable();
 		ret = nr;
 	}
@@ -1882,8 +1889,14 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write,
 		start += nr << PAGE_SHIFT;
 		pages += nr;
 
-		ret = get_user_pages_unlocked(start, nr_pages - nr, pages,
-				write ? FOLL_WRITE : 0);
+		if (gup_flags & FOLL_LONGTERM)
+			ret = get_user_pages_longterm_unlocked(start,
+							       nr_pages - nr,
+							       pages,
+							       gup_flags);
+		else
+			ret = get_user_pages_unlocked(start, nr_pages - nr,
+						      pages, gup_flags);
 
 		/* Have to be a bit careful with return values */
 		if (nr > 0) {
@@ -1897,4 +1910,49 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	return ret;
 }
 
+/**
+ * get_user_pages_fast() - pin user pages in memory
+ * @start:	starting user address
+ * @nr_pages:	number of pages from start to pin
+ * @write:	whether pages will be written to
+ * @pages:	array that receives pointers to the pages pinned.
+ *		Should be at least nr_pages long.
+ *
+ * Attempt to pin user pages in memory without taking mm->mmap_sem.
+ * If not successful, it will fall back to taking the lock and
+ * calling get_user_pages().
+ *
+ * Returns number of pages pinned. This may be fewer than the number
+ * requested. If nr_pages is 0 or negative, returns 0. If no pages
+ * were pinned, returns -errno.
+ */
+int get_user_pages_fast(unsigned long start, int nr_pages, int write,
+			struct page **pages)
+{
+	return __get_user_pages_fast_flags(start, nr_pages,
+					   write ? FOLL_WRITE : 0,
+					   pages);
+}
+
+#ifdef CONFIG_FS_DAX
+/**
+ * get_user_pages_fast_longterm() - pin user pages in memory
+ *
+ * Exactly the same semantics as get_user_pages_fast(), except that the fast
+ * path fails on device mapped pages (such as DAX pages), which then fall back
+ * to checking for FS DAX pages with get_user_pages_longterm().
+ */
+int get_user_pages_fast_longterm(unsigned long start, int nr_pages, bool write,
+				 struct page **pages)
+{
+	unsigned int gup_flags = FOLL_LONGTERM;
+
+	if (write)
+		gup_flags |= FOLL_WRITE;
+
+	return __get_user_pages_fast_flags(start, nr_pages, gup_flags, pages);
+}
+EXPORT_SYMBOL(get_user_pages_fast_longterm);
+#endif /* CONFIG_FS_DAX */
+
 #endif /* CONFIG_HAVE_GENERIC_GUP */
-- 
2.20.1
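
For readers following the control flow, here is a minimal userspace sketch of the fast-walk-then-fallback behaviour the patch describes. It is a model, not kernel code: struct model_page, model_fast_walk(), model_slow_path(), model_gup_fast_flags(), model_gup_fast() and model_gup_fast_longterm() are hypothetical names invented for illustration. Only the FOLL_LONGTERM value is taken from the patch; FOLL_WRITE uses the kernel's 0x01. The sketch assumes the longterm slow path rejects the whole remaining range with -EOPNOTSUPP when it meets an FS DAX page, and that fast and slow results are merged as in the "careful with return values" block visible in the context lines.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define FOLL_WRITE	0x01u		/* same value as the kernel flag */
#define FOLL_LONGTERM	0x10000u	/* value introduced by this patch */

/* Hypothetical stand-in for one mapped page; not a kernel type. */
struct model_page {
	bool devmap;	/* models pte_devmap() / pmd_devmap() being true */
	bool fs_dax;	/* models a page that belongs to an FS DAX mapping */
};

/* Models the lockless walk: stop at the first devmap page if FOLL_LONGTERM. */
static int model_fast_walk(const struct model_page *pt, int nr_pages,
			   unsigned int gup_flags)
{
	int nr = 0;

	while (nr < nr_pages) {
		if (pt[nr].devmap && (gup_flags & FOLL_LONGTERM))
			break;
		nr++;
	}
	return nr;
}

/* Models the locked slow path (assumption: all-or-nothing on FS DAX). */
static int model_slow_path(const struct model_page *pt, int nr_pages,
			   unsigned int gup_flags)
{
	if (gup_flags & FOLL_LONGTERM) {
		for (int i = 0; i < nr_pages; i++)
			if (pt[i].fs_dax)
				return -EOPNOTSUPP;
	}
	return nr_pages;
}

/* Models __get_user_pages_fast_flags(): fast walk, then fallback. */
static int model_gup_fast_flags(const struct model_page *pt, int nr_pages,
				unsigned int gup_flags)
{
	int nr, ret;

	assert(pt != NULL);
	if (nr_pages <= 0)
		return 0;

	nr = model_fast_walk(pt, nr_pages, gup_flags);
	ret = nr;

	if (nr < nr_pages) {
		/* Try to get the remaining pages through the slow path. */
		ret = model_slow_path(pt + nr, nr_pages - nr, gup_flags);

		/* A partial fast-walk result wins over a slow-path error. */
		if (nr > 0) {
			if (ret < 0)
				ret = nr;
			else
				ret += nr;
		}
	}
	return ret;
}

int model_gup_fast(const struct model_page *pt, int nr_pages, bool write)
{
	return model_gup_fast_flags(pt, nr_pages, write ? FOLL_WRITE : 0);
}

int model_gup_fast_longterm(const struct model_page *pt, int nr_pages,
			    bool write)
{
	unsigned int gup_flags = FOLL_LONGTERM;

	if (write)
		gup_flags |= FOLL_WRITE;

	return model_gup_fast_flags(pt, nr_pages, gup_flags);
}
```

In this model, a range of two ordinary pages followed by one FS DAX page yields 2 from model_gup_fast_longterm(), because the fast walk stops at the devmap page and the slow-path error is masked by the partial result, while model_gup_fast() pins all 3. A range that starts with an FS DAX page yields -EOPNOTSUPP from the longterm variant.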