2018-08-28 15:01:12

by Matthew Wilcox

Subject: [PATCH 00/10] Push the vm_fault_t conversion further

This patch series reaps some of the benefits of the vm_fault_t work that
Souptick has been diligently plugging away at, by converting insert_pfn()
to return a vm_fault_t.

Eventually, we'll be able to do the same thing to insert_page(), but
there's more work to be done first: all current callers of
vm_insert_page() need to be converted to vmf_insert_page(). In the
meantime, this patch series provides a nice clean-up.
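
For context, every vmf_* compatibility wrapper currently open-codes the
same errno-to-vm_fault_t translation. Here is a minimal sketch of that
pattern (the helper name is made up; no single wrapper is quoted
verbatim):

#include <linux/mm.h>

static inline vm_fault_t vmf_from_errno(int err)
{
	if (err == -ENOMEM)
		return VM_FAULT_OOM;
	if (err < 0 && err != -EBUSY)
		return VM_FAULT_SIGBUS;
	return VM_FAULT_NOPAGE;	/* 0 and -EBUSY: a PTE is now in place */
}

Pushing the vm_fault_t return type into insert_pfn() itself lets this
translation happen once, at the point of failure, instead of in every
caller.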

Nicolas, I'd like your Reviewed-by on patch 1, please.

Matthew Wilcox (10):
cramfs: Convert to use vmf_insert_mixed
mm: Remove vm_insert_mixed
mm: Introduce vmf_insert_pfn_prot
x86: Convert vdso to use vm_fault_t
mm: Make vm_insert_pfn_prot static
mm: Remove references to vm_insert_pfn
mm: Remove vm_insert_pfn
mm: Inline vm_insert_pfn_prot into caller
mm: Convert __vm_insert_mixed to vm_fault_t
mm: Convert insert_pfn to vm_fault_t

Documentation/x86/pat.txt | 4 +-
arch/x86/entry/vdso/vma.c | 24 +++----
fs/cramfs/inode.c | 9 ++-
include/asm-generic/pgtable.h | 4 +-
include/linux/hmm.h | 2 +-
include/linux/mm.h | 32 +--------
mm/memory.c | 122 +++++++++++++++++-----------------
7 files changed, 84 insertions(+), 113 deletions(-)

--
2.18.0



2018-08-28 14:59:15

by Matthew Wilcox

Subject: [PATCH 03/10] mm: Introduce vmf_insert_pfn_prot

Like vm_insert_pfn_prot(), but returns a vm_fault_t instead of an errno.
Also unexport vm_insert_pfn_prot() as it has no modular users.
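
For illustration, a driver's ->fault handler can now return the result
of vmf_insert_pfn_prot() directly. A hypothetical sketch (the
my_device_* names are made up and not part of this patch):

#include <linux/mm.h>

static vm_fault_t my_device_fault(struct vm_fault *vmf)
{
	/* Stand-in for however the driver finds the pfn backing
	 * this file offset. */
	unsigned long pfn = my_device_pfn(vmf->vma, vmf->pgoff);

	return vmf_insert_pfn_prot(vmf->vma, vmf->address, pfn,
			pgprot_noncached(vmf->vma->vm_page_prot));
}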

Signed-off-by: Matthew Wilcox <[email protected]>
---
include/linux/mm.h | 2 ++
mm/memory.c | 47 ++++++++++++++++++++++++++++++----------------
2 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ce1b24376d72..e8bc1a16d44c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2482,6 +2482,8 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, pgprot_t pgprot);
+vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
+ unsigned long pfn, pgprot_t pgprot);
vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
pfn_t pfn);
vm_fault_t vmf_insert_mixed_mkwrite(struct vm_area_struct *vma,
diff --git a/mm/memory.c b/mm/memory.c
index f0b123a94426..8c116c0f64d8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1843,21 +1843,6 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
}
EXPORT_SYMBOL(vm_insert_pfn);

-/**
- * vm_insert_pfn_prot - insert single pfn into user vma with specified pgprot
- * @vma: user vma to map to
- * @addr: target user address of this page
- * @pfn: source kernel pfn
- * @pgprot: pgprot flags for the inserted page
- *
- * This is exactly like vm_insert_pfn, except that it allows drivers to
- * to override pgprot on a per-page basis.
- *
- * This only makes sense for IO mappings, and it makes no sense for
- * cow mappings. In general, using multiple vmas is preferable;
- * vm_insert_pfn_prot should only be used if using multiple VMAs is
- * impractical.
- */
int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, pgprot_t pgprot)
{
@@ -1887,7 +1872,37 @@ int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,

return ret;
}
-EXPORT_SYMBOL(vm_insert_pfn_prot);
+
+/**
+ * vmf_insert_pfn_prot - insert single pfn into user vma with specified pgprot
+ * @vma: user vma to map to
+ * @addr: target user address of this page
+ * @pfn: source kernel pfn
+ * @pgprot: pgprot flags for the inserted page
+ *
+ * This is exactly like vmf_insert_pfn(), except that it allows drivers
+ * to override pgprot on a per-page basis.
+ *
+ * This only makes sense for IO mappings, and it makes no sense for
+ * COW mappings. In general, using multiple vmas is preferable;
+ * vm_insert_pfn_prot should only be used if using multiple VMAs is
+ * impractical.
+ *
+ * Return: vm_fault_t value.
+ */
+vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
+ unsigned long pfn, pgprot_t pgprot)
+{
+ int err = vm_insert_pfn_prot(vma, addr, pfn, pgprot);
+
+ if (err == -ENOMEM)
+ return VM_FAULT_OOM;
+ if (err < 0 && err != -EBUSY)
+ return VM_FAULT_SIGBUS;
+
+ return VM_FAULT_NOPAGE;
+}
+EXPORT_SYMBOL(vmf_insert_pfn_prot);

static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn)
{
--
2.18.0


2018-08-28 14:59:18

by Matthew Wilcox

Subject: [PATCH 10/10] mm: Convert insert_pfn to vm_fault_t

All callers of insert_pfn() convert its errno return into a vm_fault_t,
so convert insert_pfn() to return a vm_fault_t directly.

Signed-off-by: Matthew Wilcox <[email protected]>
---
mm/memory.c | 24 +++++-------------------
1 file changed, 5 insertions(+), 19 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 9fef202b4ea7..368b65390762 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1767,19 +1767,16 @@ int vm_insert_page(struct vm_area_struct *vma, unsigned long addr,
}
EXPORT_SYMBOL(vm_insert_page);

-static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
+static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
pfn_t pfn, pgprot_t prot, bool mkwrite)
{
struct mm_struct *mm = vma->vm_mm;
- int retval;
pte_t *pte, entry;
spinlock_t *ptl;

- retval = -ENOMEM;
pte = get_locked_pte(mm, addr, &ptl);
if (!pte)
- goto out;
- retval = -EBUSY;
+ return VM_FAULT_OOM;
if (!pte_none(*pte)) {
if (mkwrite) {
/*
@@ -1812,11 +1809,9 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
set_pte_at(mm, addr, pte, entry);
update_mmu_cache(vma, addr, pte); /* XXX: why not for insert_page? */

- retval = 0;
out_unlock:
pte_unmap_unlock(pte, ptl);
-out:
- return retval;
+ return VM_FAULT_NOPAGE;
}

/**
@@ -1840,8 +1835,6 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, pgprot_t pgprot)
{
- int err;
-
/*
* Technically, architectures with pte_special can avoid all these
* restrictions (same for remap_pfn_range). However we would like
@@ -1862,15 +1855,8 @@ vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,

track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV));

- err = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot,
+ return insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot,
false);
-
- if (err == -ENOMEM)
- return VM_FAULT_OOM;
- if (err < 0 && err != -EBUSY)
- return VM_FAULT_SIGBUS;
-
- return VM_FAULT_NOPAGE;
}
EXPORT_SYMBOL(vmf_insert_pfn_prot);

@@ -1950,7 +1936,7 @@ static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma,
page = pfn_to_page(pfn_t_to_pfn(pfn));
err = insert_page(vma, addr, page, pgprot);
} else {
- err = insert_pfn(vma, addr, pfn, pgprot, mkwrite);
+ return insert_pfn(vma, addr, pfn, pgprot, mkwrite);
}

if (err == -ENOMEM)
--
2.18.0


2018-08-28 14:59:26

by Matthew Wilcox

Subject: [PATCH 08/10] mm: Inline vm_insert_pfn_prot into caller

vm_insert_pfn_prot() is only called from vmf_insert_pfn_prot(),
so inline it and convert some of the errnos into vm_fault codes earlier.

Signed-off-by: Matthew Wilcox <[email protected]>
---
mm/memory.c | 55 +++++++++++++++++++++++------------------------------
1 file changed, 24 insertions(+), 31 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index d5ccbadd81c1..9e97926fee19 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1819,36 +1819,6 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
return retval;
}

-static int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
- unsigned long pfn, pgprot_t pgprot)
-{
- int ret;
- /*
- * Technically, architectures with pte_special can avoid all these
- * restrictions (same for remap_pfn_range). However we would like
- * consistency in testing and feature parity among all, so we should
- * try to keep these invariants in place for everybody.
- */
- BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)));
- BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) ==
- (VM_PFNMAP|VM_MIXEDMAP));
- BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
- BUG_ON((vma->vm_flags & VM_MIXEDMAP) && pfn_valid(pfn));
-
- if (addr < vma->vm_start || addr >= vma->vm_end)
- return -EFAULT;
-
- if (!pfn_modify_allowed(pfn, pgprot))
- return -EACCES;
-
- track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV));
-
- ret = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot,
- false);
-
- return ret;
-}
-
/**
* vmf_insert_pfn_prot - insert single pfn into user vma with specified pgprot
* @vma: user vma to map to
@@ -1870,7 +1840,30 @@ static int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, pgprot_t pgprot)
{
- int err = vm_insert_pfn_prot(vma, addr, pfn, pgprot);
+ int err;
+
+ /*
+ * Technically, architectures with pte_special can avoid all these
+ * restrictions (same for remap_pfn_range). However we would like
+ * consistency in testing and feature parity among all, so we should
+ * try to keep these invariants in place for everybody.
+ */
+ BUG_ON(!(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)));
+ BUG_ON((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) ==
+ (VM_PFNMAP|VM_MIXEDMAP));
+ BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
+ BUG_ON((vma->vm_flags & VM_MIXEDMAP) && pfn_valid(pfn));
+
+ if (addr < vma->vm_start || addr >= vma->vm_end)
+ return VM_FAULT_SIGBUS;
+
+ if (!pfn_modify_allowed(pfn, pgprot))
+ return VM_FAULT_SIGBUS;
+
+ track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV));
+
+ err = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot,
+ false);

if (err == -ENOMEM)
return VM_FAULT_OOM;
--
2.18.0


2018-08-28 14:59:42

by Matthew Wilcox

Subject: [PATCH 09/10] mm: Convert __vm_insert_mixed to vm_fault_t

Both of __vm_insert_mixed()'s callers currently convert its errno
return into a vm_fault_t, so move the conversion into
__vm_insert_mixed() itself.

Signed-off-by: Matthew Wilcox <[email protected]>
---
mm/memory.c | 36 +++++++++++++++---------------------
1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 9e97926fee19..9fef202b4ea7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1915,20 +1915,21 @@ static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn)
return false;
}

-static int __vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
- pfn_t pfn, bool mkwrite)
+static vm_fault_t __vm_insert_mixed(struct vm_area_struct *vma,
+ unsigned long addr, pfn_t pfn, bool mkwrite)
{
pgprot_t pgprot = vma->vm_page_prot;
+ int err;

BUG_ON(!vm_mixed_ok(vma, pfn));

if (addr < vma->vm_start || addr >= vma->vm_end)
- return -EFAULT;
+ return VM_FAULT_SIGBUS;

track_pfn_insert(vma, &pgprot, pfn);

if (!pfn_modify_allowed(pfn_t_to_pfn(pfn), pgprot))
- return -EACCES;
+ return VM_FAULT_SIGBUS;

/*
* If we don't have pte special, then we have to use the pfn_valid()
@@ -1947,15 +1948,10 @@ static int __vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
* result in pfn_t_has_page() == false.
*/
page = pfn_to_page(pfn_t_to_pfn(pfn));
- return insert_page(vma, addr, page, pgprot);
+ err = insert_page(vma, addr, page, pgprot);
+ } else {
+ err = insert_pfn(vma, addr, pfn, pgprot, mkwrite);
}
- return insert_pfn(vma, addr, pfn, pgprot, mkwrite);
-}
-
-vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
- pfn_t pfn)
-{
- int err = __vm_insert_mixed(vma, addr, pfn, false);

if (err == -ENOMEM)
return VM_FAULT_OOM;
@@ -1964,6 +1960,12 @@ vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,

return VM_FAULT_NOPAGE;
}
+
+vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
+ pfn_t pfn)
+{
+ return __vm_insert_mixed(vma, addr, pfn, false);
+}
EXPORT_SYMBOL(vmf_insert_mixed);

/*
@@ -1971,18 +1973,10 @@ EXPORT_SYMBOL(vmf_insert_mixed);
* different entry in the mean time, we treat that as success as we assume
* the same entry was actually inserted.
*/
-
vm_fault_t vmf_insert_mixed_mkwrite(struct vm_area_struct *vma,
unsigned long addr, pfn_t pfn)
{
- int err;
-
- err = __vm_insert_mixed(vma, addr, pfn, true);
- if (err == -ENOMEM)
- return VM_FAULT_OOM;
- if (err < 0 && err != -EBUSY)
- return VM_FAULT_SIGBUS;
- return VM_FAULT_NOPAGE;
+ return __vm_insert_mixed(vma, addr, pfn, true);
}
EXPORT_SYMBOL(vmf_insert_mixed_mkwrite);

--
2.18.0


2018-08-28 15:00:11

by Matthew Wilcox

Subject: [PATCH 06/10] mm: Remove references to vm_insert_pfn

Update documentation and comments to refer to vmf_insert_pfn() instead
of vm_insert_pfn().

Signed-off-by: Matthew Wilcox <[email protected]>
---
Documentation/x86/pat.txt | 4 ++--
include/asm-generic/pgtable.h | 4 ++--
include/linux/hmm.h | 2 +-
3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt
index 2a4ee6302122..481d8d8536ac 100644
--- a/Documentation/x86/pat.txt
+++ b/Documentation/x86/pat.txt
@@ -90,12 +90,12 @@ pci proc | -- | -- | WC |
Advanced APIs for drivers
-------------------------
A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range,
-vm_insert_pfn
+vmf_insert_pfn

Drivers wanting to export some pages to userspace do it by using mmap
interface and a combination of
1) pgprot_noncached()
-2) io_remap_pfn_range() or remap_pfn_range() or vm_insert_pfn()
+2) io_remap_pfn_range() or remap_pfn_range() or vmf_insert_pfn()

With PAT support, a new API pgprot_writecombine is being added. So, drivers can
continue to use the above sequence, with either pgprot_noncached() or
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 88ebc6102c7c..5657a20e0c59 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -757,7 +757,7 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
/*
* Interfaces that can be used by architecture code to keep track of
* memory type of pfn mappings specified by the remap_pfn_range,
- * vm_insert_pfn.
+ * vmf_insert_pfn.
*/

/*
@@ -773,7 +773,7 @@ static inline int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,

/*
* track_pfn_insert is called when a _new_ single pfn is established
- * by vm_insert_pfn().
+ * by vmf_insert_pfn().
*/
static inline void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
pfn_t pfn)
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 4c92e3ba3e16..dde947083d4e 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -107,7 +107,7 @@ enum hmm_pfn_flag_e {
* HMM_PFN_ERROR: corresponding CPU page table entry points to poisoned memory
* HMM_PFN_NONE: corresponding CPU page table entry is pte_none()
* HMM_PFN_SPECIAL: corresponding CPU page table entry is special; i.e., the
- * result of vm_insert_pfn() or vm_insert_page(). Therefore, it should not
+ * result of vmf_insert_pfn() or vm_insert_page(). Therefore, it should not
* be mirrored by a device, because the entry will never have HMM_PFN_VALID
* set and the pfn value is undefined.
*
--
2.18.0


2018-08-28 15:00:12

by Matthew Wilcox

Subject: [PATCH 07/10] mm: Remove vm_insert_pfn

All callers are now converted to vmf_insert_pfn(), so convert
vmf_insert_pfn() from being a compatibility wrapper around
vm_insert_pfn() into a compatibility wrapper around
vmf_insert_pfn_prot().

Signed-off-by: Matthew Wilcox <[email protected]>
---
include/linux/mm.h | 15 +------------
mm/memory.c | 54 +++++++++++++++++++++++++---------------------
2 files changed, 30 insertions(+), 39 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1552c67c835e..bd5e2469b637 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2478,7 +2478,7 @@ struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
unsigned long pfn, unsigned long size, pgprot_t);
int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
-int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
+vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, pgprot_t pgprot);
@@ -2501,19 +2501,6 @@ static inline vm_fault_t vmf_insert_page(struct vm_area_struct *vma,
return VM_FAULT_NOPAGE;
}

-static inline vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma,
- unsigned long addr, unsigned long pfn)
-{
- int err = vm_insert_pfn(vma, addr, pfn);
-
- if (err == -ENOMEM)
- return VM_FAULT_OOM;
- if (err < 0 && err != -EBUSY)
- return VM_FAULT_SIGBUS;
-
- return VM_FAULT_NOPAGE;
-}
-
static inline vm_fault_t vmf_error(int err)
{
if (err == -ENOMEM)
diff --git a/mm/memory.c b/mm/memory.c
index 8392a104a36d..d5ccbadd81c1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1849,30 +1849,6 @@ static int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
return ret;
}

-/**
- * vm_insert_pfn - insert single pfn into user vma
- * @vma: user vma to map to
- * @addr: target user address of this page
- * @pfn: source kernel pfn
- *
- * Similar to vm_insert_page, this allows drivers to insert individual pages
- * they've allocated into a user vma. Same comments apply.
- *
- * This function should only be called from a vm_ops->fault handler, and
- * in that case the handler should return NULL.
- *
- * vma cannot be a COW mapping.
- *
- * As this is called only for pages that do not currently exist, we
- * do not need to flush old virtual caches or the TLB.
- */
-int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
- unsigned long pfn)
-{
- return vm_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot);
-}
-EXPORT_SYMBOL(vm_insert_pfn);
-
/**
* vmf_insert_pfn_prot - insert single pfn into user vma with specified pgprot
* @vma: user vma to map to
@@ -1885,9 +1861,10 @@ EXPORT_SYMBOL(vm_insert_pfn);
*
* This only makes sense for IO mappings, and it makes no sense for
* COW mappings. In general, using multiple vmas is preferable;
- * vm_insert_pfn_prot should only be used if using multiple VMAs is
+ * vmf_insert_pfn_prot should only be used if using multiple VMAs is
* impractical.
*
+ * Context: Process context. May allocate using %GFP_KERNEL.
* Return: vm_fault_t value.
*/
vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
@@ -1904,6 +1881,33 @@ vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
}
EXPORT_SYMBOL(vmf_insert_pfn_prot);

+/**
+ * vmf_insert_pfn - insert single pfn into user vma
+ * @vma: user vma to map to
+ * @addr: target user address of this page
+ * @pfn: source kernel pfn
+ *
+ * Similar to vm_insert_page, this allows drivers to insert individual pages
+ * they've allocated into a user vma. Same comments apply.
+ *
+ * This function should only be called from a vm_ops->fault handler, and
+ * in that case the handler should return the result of this function.
+ *
+ * vma cannot be a COW mapping.
+ *
+ * As this is called only for pages that do not currently exist, we
+ * do not need to flush old virtual caches or the TLB.
+ *
+ * Context: Process context. May allocate using %GFP_KERNEL.
+ * Return: vm_fault_t value.
+ */
+vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
+ unsigned long pfn)
+{
+ return vmf_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot);
+}
+EXPORT_SYMBOL(vmf_insert_pfn);
+
static bool vm_mixed_ok(struct vm_area_struct *vma, pfn_t pfn)
{
/* these checks mirror the abort conditions in vm_normal_page */
--
2.18.0


2018-08-28 15:00:14

by Matthew Wilcox

Subject: [PATCH 02/10] mm: Remove vm_insert_mixed

All callers are now converted to vmf_insert_mixed(), so convert
vmf_insert_mixed() from being a compatibility wrapper into the real
function.

Signed-off-by: Matthew Wilcox <[email protected]>
---
include/linux/mm.h | 15 +--------------
mm/memory.c | 14 ++++++++++----
2 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a61ebe8ad4ca..ce1b24376d72 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2482,7 +2482,7 @@ int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, pgprot_t pgprot);
-int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
+vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
pfn_t pfn);
vm_fault_t vmf_insert_mixed_mkwrite(struct vm_area_struct *vma,
unsigned long addr, pfn_t pfn);
@@ -2501,19 +2501,6 @@ static inline vm_fault_t vmf_insert_page(struct vm_area_struct *vma,
return VM_FAULT_NOPAGE;
}

-static inline vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma,
- unsigned long addr, pfn_t pfn)
-{
- int err = vm_insert_mixed(vma, addr, pfn);
-
- if (err == -ENOMEM)
- return VM_FAULT_OOM;
- if (err < 0 && err != -EBUSY)
- return VM_FAULT_SIGBUS;
-
- return VM_FAULT_NOPAGE;
-}
-
static inline vm_fault_t vmf_insert_pfn(struct vm_area_struct *vma,
unsigned long addr, unsigned long pfn)
{
diff --git a/mm/memory.c b/mm/memory.c
index c467102a5cbc..f0b123a94426 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1940,13 +1940,19 @@ static int __vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
return insert_pfn(vma, addr, pfn, pgprot, mkwrite);
}

-int vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
- pfn_t pfn)
+vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
+ pfn_t pfn)
{
- return __vm_insert_mixed(vma, addr, pfn, false);
+ int err = __vm_insert_mixed(vma, addr, pfn, false);

+ if (err == -ENOMEM)
+ return VM_FAULT_OOM;
+ if (err < 0 && err != -EBUSY)
+ return VM_FAULT_SIGBUS;
+
+ return VM_FAULT_NOPAGE;
}
-EXPORT_SYMBOL(vm_insert_mixed);
+EXPORT_SYMBOL(vmf_insert_mixed);

/*
* If the insertion of PTE failed because someone else already added a
--
2.18.0


2018-08-28 15:00:17

by Matthew Wilcox

Subject: [PATCH 05/10] mm: Make vm_insert_pfn_prot static

Now that vm_insert_pfn_prot() is no longer used outside mm/memory.c,
make it static.

Signed-off-by: Matthew Wilcox <[email protected]>
---
include/linux/mm.h | 2 --
mm/memory.c | 50 +++++++++++++++++++++++-----------------------
2 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index e8bc1a16d44c..1552c67c835e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2480,8 +2480,6 @@ int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn);
-int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
- unsigned long pfn, pgprot_t pgprot);
vm_fault_t vmf_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, pgprot_t pgprot);
vm_fault_t vmf_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/memory.c b/mm/memory.c
index 8c116c0f64d8..8392a104a36d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1819,31 +1819,7 @@ static int insert_pfn(struct vm_area_struct *vma, unsigned long addr,
return retval;
}

-/**
- * vm_insert_pfn - insert single pfn into user vma
- * @vma: user vma to map to
- * @addr: target user address of this page
- * @pfn: source kernel pfn
- *
- * Similar to vm_insert_page, this allows drivers to insert individual pages
- * they've allocated into a user vma. Same comments apply.
- *
- * This function should only be called from a vm_ops->fault handler, and
- * in that case the handler should return NULL.
- *
- * vma cannot be a COW mapping.
- *
- * As this is called only for pages that do not currently exist, we
- * do not need to flush old virtual caches or the TLB.
- */
-int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
- unsigned long pfn)
-{
- return vm_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot);
-}
-EXPORT_SYMBOL(vm_insert_pfn);
-
-int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
+static int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
unsigned long pfn, pgprot_t pgprot)
{
int ret;
@@ -1873,6 +1849,30 @@ int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
return ret;
}

+/**
+ * vm_insert_pfn - insert single pfn into user vma
+ * @vma: user vma to map to
+ * @addr: target user address of this page
+ * @pfn: source kernel pfn
+ *
+ * Similar to vm_insert_page, this allows drivers to insert individual pages
+ * they've allocated into a user vma. Same comments apply.
+ *
+ * This function should only be called from a vm_ops->fault handler, and
+ * in that case the handler should return NULL.
+ *
+ * vma cannot be a COW mapping.
+ *
+ * As this is called only for pages that do not currently exist, we
+ * do not need to flush old virtual caches or the TLB.
+ */
+int vm_insert_pfn(struct vm_area_struct *vma, unsigned long addr,
+ unsigned long pfn)
+{
+ return vm_insert_pfn_prot(vma, addr, pfn, vma->vm_page_prot);
+}
+EXPORT_SYMBOL(vm_insert_pfn);
+
/**
* vmf_insert_pfn_prot - insert single pfn into user vma with specified pgprot
* @vma: user vma to map to
--
2.18.0


2018-08-28 15:00:26

by Matthew Wilcox

Subject: [PATCH 04/10] x86: Convert vdso to use vm_fault_t

Return vm_fault_t codes directly from the appropriate mm routines instead
of converting from errnos ourselves. Fixes a minor bug where we'd return
SIGBUS instead of the correct OOM code if we ran out of memory allocating
page tables.
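
To make the bug concrete, the tail of the old vvar_fault() (removed
below) collapsed every errno other than 0 and -EBUSY, including
-ENOMEM, into SIGBUS:

	if (ret == 0 || ret == -EBUSY)
		return VM_FAULT_NOPAGE;
	return VM_FAULT_SIGBUS;	/* -ENOMEM lands here, not VM_FAULT_OOM */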

Signed-off-by: Matthew Wilcox <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
---
arch/x86/entry/vdso/vma.c | 24 +++++++++---------------
1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 5b8b556dbb12..d1daa53215a4 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -39,7 +39,7 @@ void __init init_vdso_image(const struct vdso_image *image)

struct linux_binprm;

-static int vdso_fault(const struct vm_special_mapping *sm,
+static vm_fault_t vdso_fault(const struct vm_special_mapping *sm,
struct vm_area_struct *vma, struct vm_fault *vmf)
{
const struct vdso_image *image = vma->vm_mm->context.vdso_image;
@@ -84,12 +84,11 @@ static int vdso_mremap(const struct vm_special_mapping *sm,
return 0;
}

-static int vvar_fault(const struct vm_special_mapping *sm,
+static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
struct vm_area_struct *vma, struct vm_fault *vmf)
{
const struct vdso_image *image = vma->vm_mm->context.vdso_image;
long sym_offset;
- int ret = -EFAULT;

if (!image)
return VM_FAULT_SIGBUS;
@@ -108,29 +107,24 @@ static int vvar_fault(const struct vm_special_mapping *sm,
return VM_FAULT_SIGBUS;

if (sym_offset == image->sym_vvar_page) {
- ret = vm_insert_pfn(vma, vmf->address,
- __pa_symbol(&__vvar_page) >> PAGE_SHIFT);
+ return vmf_insert_pfn(vma, vmf->address,
+ __pa_symbol(&__vvar_page) >> PAGE_SHIFT);
} else if (sym_offset == image->sym_pvclock_page) {
struct pvclock_vsyscall_time_info *pvti =
pvclock_get_pvti_cpu0_va();
if (pvti && vclock_was_used(VCLOCK_PVCLOCK)) {
- ret = vm_insert_pfn_prot(
- vma,
- vmf->address,
- __pa(pvti) >> PAGE_SHIFT,
- pgprot_decrypted(vma->vm_page_prot));
+ return vmf_insert_pfn_prot(vma, vmf->address,
+ __pa(pvti) >> PAGE_SHIFT,
+ pgprot_decrypted(vma->vm_page_prot));
}
} else if (sym_offset == image->sym_hvclock_page) {
struct ms_hyperv_tsc_page *tsc_pg = hv_get_tsc_page();

if (tsc_pg && vclock_was_used(VCLOCK_HVCLOCK))
- ret = vm_insert_pfn(vma, vmf->address,
- vmalloc_to_pfn(tsc_pg));
+ return vmf_insert_pfn(vma, vmf->address,
+ vmalloc_to_pfn(tsc_pg));
}

- if (ret == 0 || ret == -EBUSY)
- return VM_FAULT_NOPAGE;
-
return VM_FAULT_SIGBUS;
}

--
2.18.0


2018-08-28 15:00:35

by Matthew Wilcox

Subject: [PATCH 01/10] cramfs: Convert to use vmf_insert_mixed

cramfs is the only remaining user of vm_insert_mixed; convert it.
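
(VM_FAULT_ERROR, used below, is a mask of the error bits: VM_FAULT_OOM,
VM_FAULT_SIGBUS, VM_FAULT_SIGSEGV, the hwpoison codes and
VM_FAULT_FALLBACK. So "vmf & VM_FAULT_ERROR" is the idiomatic
"did this fail?" test.)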

Signed-off-by: Matthew Wilcox <[email protected]>
---
fs/cramfs/inode.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index f408994fc632..b72449c19cd1 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -417,10 +417,15 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
*/
int i;
vma->vm_flags |= VM_MIXEDMAP;
- for (i = 0; i < pages && !ret; i++) {
+ for (i = 0; i < pages; i++) {
+ vm_fault_t vmf;
unsigned long off = i * PAGE_SIZE;
pfn_t pfn = phys_to_pfn_t(address + off, PFN_DEV);
- ret = vm_insert_mixed(vma, vma->vm_start + off, pfn);
+ vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn);
+ if (vmf & VM_FAULT_ERROR) {
+ pages = i;
+ break;
+ }
}
}

--
2.18.0


2018-08-28 17:51:00

by Nicolas Pitre

Subject: Re: [PATCH 01/10] cramfs: Convert to use vmf_insert_mixed

On Tue, 28 Aug 2018, Matthew Wilcox wrote:

> cramfs is the only remaining user of vm_insert_mixed; convert it.
>
> Signed-off-by: Matthew Wilcox <[email protected]>
> ---
> fs/cramfs/inode.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
> index f408994fc632..b72449c19cd1 100644
> --- a/fs/cramfs/inode.c
> +++ b/fs/cramfs/inode.c
> @@ -417,10 +417,15 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
> */
> int i;
> vma->vm_flags |= VM_MIXEDMAP;
> - for (i = 0; i < pages && !ret; i++) {
> + for (i = 0; i < pages; i++) {
> + vm_fault_t vmf;
> unsigned long off = i * PAGE_SIZE;
> pfn_t pfn = phys_to_pfn_t(address + off, PFN_DEV);
> - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn);
> + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn);
> + if (vmf & VM_FAULT_ERROR) {
> + pages = i;
> + break;
> + }

I'd suggest this to properly deal with errors instead:

diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index f408994fc6..0c35e62f10 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -418,9 +418,12 @@ static int cramfs_physmem_mmap(struct file *file, struct vm_area_struct *vma)
int i;
vma->vm_flags |= VM_MIXEDMAP;
for (i = 0; i < pages && !ret; i++) {
+ vm_fault_t vmf;
unsigned long off = i * PAGE_SIZE;
pfn_t pfn = phys_to_pfn_t(address + off, PFN_DEV);
- ret = vm_insert_mixed(vma, vma->vm_start + off, pfn);
+ vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn);
+ if (vmf & VM_FAULT_ERROR)
+ ret = vm_fault_to_errno(vmf, 0);
}
}

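For reference, vm_fault_to_errno() maps a fault code back to an errno
roughly like this (a paraphrase of the include/linux/mm.h helper, not a
verbatim quote):

static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
{
	if (vm_fault & VM_FAULT_OOM)
		return -ENOMEM;
	if (vm_fault & (VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE))
		return (foll_flags & FOLL_HWPOISON) ? -EHWPOISON : -EFAULT;
	if (vm_fault & (VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV))
		return -EFAULT;
	return 0;
}

With foll_flags of 0, as above, an error becomes either -ENOMEM or
-EFAULT.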

Nicolas

2018-08-28 22:06:12

by Matthew Wilcox

Subject: Re: [PATCH 01/10] cramfs: Convert to use vmf_insert_mixed

On Tue, Aug 28, 2018 at 01:49:25PM -0400, Nicolas Pitre wrote:
> On Tue, 28 Aug 2018, Matthew Wilcox wrote:
> > - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn);
> > + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn);
> > + if (vmf & VM_FAULT_ERROR) {
> > + pages = i;
> > + break;
> > + }
>
> I'd suggest this to properly deal with errors instead:
>
> - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn);
> + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn);
> + if (vmf & VM_FAULT_ERROR)
> + ret = vm_fault_to_errno(vmf, 0);

By my reading of this function, the intent is actually to return 0
here and allow demand paging to work. Of course, I've spent all of
twenty minutes staring at this function, so I defer to the maintainer.
I think you'd need to be running a make-memory-allocations-fail fuzzer
to hit this, so it's likely never been tested.


2018-08-28 23:53:55

by Nicolas Pitre

Subject: Re: [PATCH 01/10] cramfs: Convert to use vmf_insert_mixed

On Tue, 28 Aug 2018, Matthew Wilcox wrote:

> On Tue, Aug 28, 2018 at 01:49:25PM -0400, Nicolas Pitre wrote:
> > On Tue, 28 Aug 2018, Matthew Wilcox wrote:
> > > - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn);
> > > + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn);
> > > + if (vmf & VM_FAULT_ERROR) {
> > > + pages = i;
> > > + break;
> > > + }
> >
> > I'd suggest this to properly deal with errors instead:
> >
> > - ret = vm_insert_mixed(vma, vma->vm_start + off, pfn);
> > + vmf = vmf_insert_mixed(vma, vma->vm_start + off, pfn);
> > + if (vmf & VM_FAULT_ERROR)
> > + ret = vm_fault_to_errno(vmf, 0);
>
> By my reading of this function, the intent is actually to return 0
> here and allow demand paging to work. Of course, I've spent all of
> twenty minutes staring at this function, so I defer to the maintainer.

Demand paging is used when the filesystem layout isn't amenable to a
direct mapping. It is not a fallback for when we're OOM or hit some
other internal error; those ought to be reported immediately.

> I think you'd need to be running a make-memory-allocations-fail fuzzer
> to hit this, so it's likely never been tested.

Well, it has been tested, sort of: e.g. vm_insert_mixed() returned an
error due to misaligned addresses during development. Normally,
vm_insert_mixed() and vmf_insert_mixed() should always succeed, and if
they don't we certainly don't want to ignore it.


Nicolas