This is version 2 of the patch series with a few non-functional changes.
Changes are described in the individual changelog.
This patch series incorporates a few changes in the get_user_page usage
of sgi-gru.
The main change is the first patch, which is a trivial one line change to
convert put_page to put_user_page to enable tracking of get_user_pages.
The second patch removes an uneccessary ifdef of CONFIG_HUGETLB.
The third patch adds __get_user_pages_fast in atomic_pte_lookup to retrive
a physical user page in an atomic context instead of manually walking up
the page tables like the current code does. This patch should be subject to
more review from the gup people.
drivers/misc/sgi-gru/* builds after this patch series. But I do not have the
hardware to verify these changes.
The first patch implements gup tracking in the current code. This is to be tested
as to check whether gup tracking works properly. Currently, in the upstream kernels
put_user_page simply calls put_page. But that is to change in the future.
Any suggestions as to how to test this code?
The implementation of gup tracking is in:
https://github.com/johnhubbard/linux/tree/gup_dma_core
We could test it by applying the first patch to the above tree and test it.
More details are in the individual changelogs.
Bharath Vedartham (3):
sgi-gru: Convert put_page() to get_user_page*()
sgi-gru: Remove CONFIG_HUGETLB_PAGE ifdef
sgi-gru: Use __get_user_pages_fast in atomic_pte_lookup
drivers/misc/sgi-gru/grufault.c | 73 ++++++++++++++---------------------------
1 file changed, 24 insertions(+), 49 deletions(-)
--
2.7.4
is_vm_hugetlb_page has checks for whether CONFIG_HUGETLB_PAGE is defined
or not. If CONFIG_HUGETLB_PAGE is not defined is_vm_hugetlb_page will
always return false. There is no need to have an uneccessary
CONFIG_HUGETLB_PAGE check in the code.
Cc: Ira Weiny <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Dimitri Sivanich <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: William Kucharski <[email protected]>
Cc: [email protected]
Cc: [email protected]
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Signed-off-by: Bharath Vedartham <[email protected]>
---
Changes since v2
- Added an 'unlikely' if statement as suggested by William
Kucharski. This is because of the fact that most systems
using this driver won't have CONFIG_HUGE_PAGE enabled and we
optimize the branch with an unlikely.
Signed-off-by: Bharath Vedartham <[email protected]>
---
drivers/misc/sgi-gru/grufault.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/drivers/misc/sgi-gru/grufault.c b/drivers/misc/sgi-gru/grufault.c
index 61b3447..bce47af 100644
--- a/drivers/misc/sgi-gru/grufault.c
+++ b/drivers/misc/sgi-gru/grufault.c
@@ -180,11 +180,11 @@ static int non_atomic_pte_lookup(struct vm_area_struct *vma,
{
struct page *page;
-#ifdef CONFIG_HUGETLB_PAGE
- *pageshift = is_vm_hugetlb_page(vma) ? HPAGE_SHIFT : PAGE_SHIFT;
-#else
- *pageshift = PAGE_SHIFT;
-#endif
+ if (unlikely(is_vm_hugetlb_page(vma)))
+ *pageshift = HPAGE_SHIFT;
+ else
+ *pageshift = PAGE_SHIFT;
+
if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
return -EFAULT;
*paddr = page_to_phys(page);
@@ -238,11 +238,12 @@ static int atomic_pte_lookup(struct vm_area_struct *vma, unsigned long vaddr,
return 1;
*paddr = pte_pfn(pte) << PAGE_SHIFT;
-#ifdef CONFIG_HUGETLB_PAGE
- *pageshift = is_vm_hugetlb_page(vma) ? HPAGE_SHIFT : PAGE_SHIFT;
-#else
- *pageshift = PAGE_SHIFT;
-#endif
+
+ if (unlikely(is_vm_hugetlb_page(vma)))
+ *pageshift = HPAGE_SHIFT;
+ else
+ *pageshift = PAGE_SHIFT;
+
return 0;
err:
--
2.7.4
*pte_lookup functions get the physical address for a given virtual
address by getting a physical page using gup and use page_to_phys to get
the physical address.
Currently, atomic_pte_lookup manually walks the page tables. If this
function fails to get a physical page, it will fall back too
non_atomic_pte_lookup to get a physical page which uses the slow gup
path to get the physical page.
Instead of manually walking the page tables use __get_user_pages_fast
which does the same thing and it does not fall back to the slow gup
path.
Also, the function atomic_pte_lookup's return value has been changed to boolean.
The callsites have been appropriately modified.
This is largely inspired from kvm code. kvm uses __get_user_pages_fast
in hva_to_pfn_fast function which can run in an atomic context.
Cc: Ira Weiny <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Dimitri Sivanich <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: William Kucharski <[email protected]>
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Ira Weiny <[email protected]>
Signed-off-by: Bharath Vedartham <[email protected]>
---
Changes since v2
- Modified the return value of atomic_pte_lookup
to use booleans rather than numeric values.
This was suggested by John Hubbard.
---
drivers/misc/sgi-gru/grufault.c | 56 +++++++++++------------------------------
1 file changed, 15 insertions(+), 41 deletions(-)
diff --git a/drivers/misc/sgi-gru/grufault.c b/drivers/misc/sgi-gru/grufault.c
index bce47af..da2d2cc 100644
--- a/drivers/misc/sgi-gru/grufault.c
+++ b/drivers/misc/sgi-gru/grufault.c
@@ -193,9 +193,11 @@ static int non_atomic_pte_lookup(struct vm_area_struct *vma,
}
/*
- * atomic_pte_lookup
+ * atomic_pte_lookup() - Convert a user virtual address
+ * to a physical address.
+ * @Return: true for success, false for failure. Failure means that
+ * the page could not be pinned via gup fast.
*
- * Convert a user virtual address to a physical address
* Only supports Intel large pages (2MB only) on x86_64.
* ZZZ - hugepage support is incomplete
*
@@ -205,49 +207,20 @@ static int non_atomic_pte_lookup(struct vm_area_struct *vma,
static int atomic_pte_lookup(struct vm_area_struct *vma, unsigned long vaddr,
int write, unsigned long *paddr, int *pageshift)
{
- pgd_t *pgdp;
- p4d_t *p4dp;
- pud_t *pudp;
- pmd_t *pmdp;
- pte_t pte;
-
- pgdp = pgd_offset(vma->vm_mm, vaddr);
- if (unlikely(pgd_none(*pgdp)))
- goto err;
-
- p4dp = p4d_offset(pgdp, vaddr);
- if (unlikely(p4d_none(*p4dp)))
- goto err;
-
- pudp = pud_offset(p4dp, vaddr);
- if (unlikely(pud_none(*pudp)))
- goto err;
-
- pmdp = pmd_offset(pudp, vaddr);
- if (unlikely(pmd_none(*pmdp)))
- goto err;
-#ifdef CONFIG_X86_64
- if (unlikely(pmd_large(*pmdp)))
- pte = *(pte_t *) pmdp;
- else
-#endif
- pte = *pte_offset_kernel(pmdp, vaddr);
-
- if (unlikely(!pte_present(pte) ||
- (write && (!pte_write(pte) || !pte_dirty(pte)))))
- return 1;
-
- *paddr = pte_pfn(pte) << PAGE_SHIFT;
+ struct page *page;
if (unlikely(is_vm_hugetlb_page(vma)))
*pageshift = HPAGE_SHIFT;
else
*pageshift = PAGE_SHIFT;
- return 0;
+ if (!__get_user_pages_fast(vaddr, 1, write, &page))
+ return false;
-err:
- return 1;
+ *paddr = page_to_phys(page);
+ put_user_page(page);
+
+ return true;
}
static int gru_vtop(struct gru_thread_state *gts, unsigned long vaddr,
@@ -256,7 +229,8 @@ static int gru_vtop(struct gru_thread_state *gts, unsigned long vaddr,
struct mm_struct *mm = gts->ts_mm;
struct vm_area_struct *vma;
unsigned long paddr;
- int ret, ps;
+ int ps;
+ bool success;
vma = find_vma(mm, vaddr);
if (!vma)
@@ -267,8 +241,8 @@ static int gru_vtop(struct gru_thread_state *gts, unsigned long vaddr,
* context.
*/
rmb(); /* Must/check ms_range_active before loading PTEs */
- ret = atomic_pte_lookup(vma, vaddr, write, &paddr, &ps);
- if (ret) {
+ success = atomic_pte_lookup(vma, vaddr, write, &paddr, &ps);
+ if (!success) {
if (atomic)
goto upm;
if (non_atomic_pte_lookup(vma, vaddr, write, &paddr, &ps))
--
2.7.4
For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().
This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").
Cc: Ira Weiny <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Dimitri Sivanich <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: William Kucharski <[email protected]>
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Signed-off-by: Bharath Vedartham <[email protected]>
---
drivers/misc/sgi-gru/grufault.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/misc/sgi-gru/grufault.c b/drivers/misc/sgi-gru/grufault.c
index 4b713a8..61b3447 100644
--- a/drivers/misc/sgi-gru/grufault.c
+++ b/drivers/misc/sgi-gru/grufault.c
@@ -188,7 +188,7 @@ static int non_atomic_pte_lookup(struct vm_area_struct *vma,
if (get_user_pages(vaddr, 1, write ? FOLL_WRITE : 0, &page, NULL) <= 0)
return -EFAULT;
*paddr = page_to_phys(page);
- put_page(page);
+ put_user_page(page);
return 0;
}
--
2.7.4
I think the atomic_pte_lookup / non_atomic_pte_lookup helpers
should simply go away. Most of the setup code is common now and should
be in the caller where it can be shared. Then just do a:
if (atomic) {
__get_user_pages_fast()
} else {
get_user_pages_fast();
}
and we actually have an easy to understand piece of code.
On Wed, Jul 24, 2019 at 09:09:29AM -0700, Christoph Hellwig wrote:
> I think the atomic_pte_lookup / non_atomic_pte_lookup helpers
> should simply go away. Most of the setup code is common now and should
> be in the caller where it can be shared. Then just do a:
>
> if (atomic) {
> __get_user_pages_fast()
> } else {
> get_user_pages_fast();
> }
>
> and we actually have an easy to understand piece of code.
That makes sense. I ll do that and send v3. I ll probably cut down on a
patch and try to fold all the changes into a single patch removing the
*pte_lookup helpers.