2020-10-26 11:00:08

by Daniel Vetter

[permalink] [raw]
Subject: [PATCH v4 05/15] mm/frame-vector: Use FOLL_LONGTERM

This is used by media/videbuf2 for persistent dma mappings, not just
for a single dma operation and then freed again, so needs
FOLL_LONGTERM.

Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
locking issues. Rework the code to pull the pup path out from the
mmap_sem critical section as suggested by Jason.

By relying entirely on the vma checks in pin_user_pages and follow_pfn
(for vm_flags and vma_is_fsdax) we can also streamline the code a lot.

Signed-off-by: Daniel Vetter <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Pawel Osciak <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Kyungmin Park <[email protected]>
Cc: Tomasz Figa <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Daniel Vetter <[email protected]>
--
v2: Streamline the code and further simplify the loop checks (Jason)
---
mm/frame_vector.c | 50 ++++++++++++++---------------------------------
1 file changed, 15 insertions(+), 35 deletions(-)

diff --git a/mm/frame_vector.c b/mm/frame_vector.c
index 10f82d5643b6..d44779e56313 100644
--- a/mm/frame_vector.c
+++ b/mm/frame_vector.c
@@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
struct vm_area_struct *vma;
int ret = 0;
int err;
- int locked;

if (nr_frames == 0)
return 0;
@@ -48,40 +47,25 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,

start = untagged_addr(start);

- mmap_read_lock(mm);
- locked = 1;
- vma = find_vma_intersection(mm, start, start + 1);
- if (!vma) {
- ret = -EFAULT;
- goto out;
- }
-
- /*
- * While get_vaddr_frames() could be used for transient (kernel
- * controlled lifetime) pinning of memory pages all current
- * users establish long term (userspace controlled lifetime)
- * page pinning. Treat get_vaddr_frames() like
- * get_user_pages_longterm() and disallow it for filesystem-dax
- * mappings.
- */
- if (vma_is_fsdax(vma)) {
- ret = -EOPNOTSUPP;
- goto out;
- }
-
- if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
+ ret = pin_user_pages_fast(start, nr_frames,
+ FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
+ (struct page **)(vec->ptrs));
+ if (ret > 0) {
vec->got_ref = true;
vec->is_pfns = false;
- ret = pin_user_pages_locked(start, nr_frames,
- gup_flags, (struct page **)(vec->ptrs), &locked);
- goto out;
+ goto out_unlocked;
}

+ mmap_read_lock(mm);
vec->got_ref = false;
vec->is_pfns = true;
do {
unsigned long *nums = frame_vector_pfns(vec);

+ vma = find_vma_intersection(mm, start, start + 1);
+ if (!vma)
+ break;
+
while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
err = follow_pfn(vma, start, &nums[ret]);
if (err) {
@@ -92,17 +76,13 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
start += PAGE_SIZE;
ret++;
}
- /*
- * We stop if we have enough pages or if VMA doesn't completely
- * cover the tail page.
- */
- if (ret >= nr_frames || start < vma->vm_end)
+ /* Bail out if VMA doesn't completely cover the tail page. */
+ if (start < vma->vm_end)
break;
- vma = find_vma_intersection(mm, start, start + 1);
- } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP));
+ } while (ret < nr_frames);
out:
- if (locked)
- mmap_read_unlock(mm);
+ mmap_read_unlock(mm);
+out_unlocked:
if (!ret)
ret = -EFAULT;
if (ret > 0)
--
2.28.0


2020-10-27 00:09:14

by Tomasz Figa

[permalink] [raw]
Subject: Re: [PATCH v4 05/15] mm/frame-vector: Use FOLL_LONGTERM

Hi Daniel,

On Mon, Oct 26, 2020 at 11:58:08AM +0100, Daniel Vetter wrote:
> This is used by media/videbuf2 for persistent dma mappings, not just
> for a single dma operation and then freed again, so needs
> FOLL_LONGTERM.
>
> Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
> locking issues. Rework the code to pull the pup path out from the
> mmap_sem critical section as suggested by Jason.
>
> By relying entirely on the vma checks in pin_user_pages and follow_pfn
> (for vm_flags and vma_is_fsdax) we can also streamline the code a lot.
>
> Signed-off-by: Daniel Vetter <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Pawel Osciak <[email protected]>
> Cc: Marek Szyprowski <[email protected]>
> Cc: Kyungmin Park <[email protected]>
> Cc: Tomasz Figa <[email protected]>
> Cc: Mauro Carvalho Chehab <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: John Hubbard <[email protected]>
> Cc: J?r?me Glisse <[email protected]>
> Cc: Jan Kara <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Daniel Vetter <[email protected]>
> --
> v2: Streamline the code and further simplify the loop checks (Jason)
> ---
> mm/frame_vector.c | 50 ++++++++++++++---------------------------------
> 1 file changed, 15 insertions(+), 35 deletions(-)
>

Thank you for the patch. Please see my comments inline.

> diff --git a/mm/frame_vector.c b/mm/frame_vector.c
> index 10f82d5643b6..d44779e56313 100644
> --- a/mm/frame_vector.c
> +++ b/mm/frame_vector.c
> @@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> struct vm_area_struct *vma;
> int ret = 0;
> int err;
> - int locked;
>
> if (nr_frames == 0)
> return 0;
> @@ -48,40 +47,25 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
>
> start = untagged_addr(start);
>
> - mmap_read_lock(mm);
> - locked = 1;
> - vma = find_vma_intersection(mm, start, start + 1);
> - if (!vma) {
> - ret = -EFAULT;
> - goto out;
> - }
> -
> - /*
> - * While get_vaddr_frames() could be used for transient (kernel
> - * controlled lifetime) pinning of memory pages all current
> - * users establish long term (userspace controlled lifetime)
> - * page pinning. Treat get_vaddr_frames() like
> - * get_user_pages_longterm() and disallow it for filesystem-dax
> - * mappings.
> - */
> - if (vma_is_fsdax(vma)) {
> - ret = -EOPNOTSUPP;
> - goto out;
> - }
> -
> - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
> + ret = pin_user_pages_fast(start, nr_frames,
> + FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
> + (struct page **)(vec->ptrs));
> + if (ret > 0) {
> vec->got_ref = true;
> vec->is_pfns = false;
> - ret = pin_user_pages_locked(start, nr_frames,
> - gup_flags, (struct page **)(vec->ptrs), &locked);

Should we drop the gup_flags argument, since it's ignored now?

> - goto out;
> + goto out_unlocked;
> }
>

Should we initialize ret with 0 here, since pin_user_pages_fast() can
return a negative error code, but below we use it as a counter for the
looked up frames?

Best regards,
Tomasz

> + mmap_read_lock(mm);
> vec->got_ref = false;
> vec->is_pfns = true;
> do {
> unsigned long *nums = frame_vector_pfns(vec);
>
> + vma = find_vma_intersection(mm, start, start + 1);
> + if (!vma)
> + break;
> +
> while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
> err = follow_pfn(vma, start, &nums[ret]);
> if (err) {
> @@ -92,17 +76,13 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> start += PAGE_SIZE;
> ret++;
> }
> - /*
> - * We stop if we have enough pages or if VMA doesn't completely
> - * cover the tail page.
> - */
> - if (ret >= nr_frames || start < vma->vm_end)
> + /* Bail out if VMA doesn't completely cover the tail page. */
> + if (start < vma->vm_end)
> break;
> - vma = find_vma_intersection(mm, start, start + 1);
> - } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP));
> + } while (ret < nr_frames);
> out:
> - if (locked)
> - mmap_read_unlock(mm);
> + mmap_read_unlock(mm);
> +out_unlocked:
> if (!ret)
> ret = -EFAULT;
> if (ret > 0)
> --
> 2.28.0
>

2020-10-28 01:49:01

by Daniel Vetter

[permalink] [raw]
Subject: Re: [PATCH v4 05/15] mm/frame-vector: Use FOLL_LONGTERM

On Mon, Oct 26, 2020 at 11:15 PM Tomasz Figa <[email protected]> wrote:
>
> Hi Daniel,
>
> On Mon, Oct 26, 2020 at 11:58:08AM +0100, Daniel Vetter wrote:
> > This is used by media/videbuf2 for persistent dma mappings, not just
> > for a single dma operation and then freed again, so needs
> > FOLL_LONGTERM.
> >
> > Unfortunately current pup_locked doesn't support FOLL_LONGTERM due to
> > locking issues. Rework the code to pull the pup path out from the
> > mmap_sem critical section as suggested by Jason.
> >
> > By relying entirely on the vma checks in pin_user_pages and follow_pfn
> > (for vm_flags and vma_is_fsdax) we can also streamline the code a lot.
> >
> > Signed-off-by: Daniel Vetter <[email protected]>
> > Cc: Jason Gunthorpe <[email protected]>
> > Cc: Pawel Osciak <[email protected]>
> > Cc: Marek Szyprowski <[email protected]>
> > Cc: Kyungmin Park <[email protected]>
> > Cc: Tomasz Figa <[email protected]>
> > Cc: Mauro Carvalho Chehab <[email protected]>
> > Cc: Andrew Morton <[email protected]>
> > Cc: John Hubbard <[email protected]>
> > Cc: Jérôme Glisse <[email protected]>
> > Cc: Jan Kara <[email protected]>
> > Cc: Dan Williams <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Signed-off-by: Daniel Vetter <[email protected]>
> > --
> > v2: Streamline the code and further simplify the loop checks (Jason)
> > ---
> > mm/frame_vector.c | 50 ++++++++++++++---------------------------------
> > 1 file changed, 15 insertions(+), 35 deletions(-)
> >
>
> Thank you for the patch. Please see my comments inline.
>
> > diff --git a/mm/frame_vector.c b/mm/frame_vector.c
> > index 10f82d5643b6..d44779e56313 100644
> > --- a/mm/frame_vector.c
> > +++ b/mm/frame_vector.c
> > @@ -38,7 +38,6 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> > struct vm_area_struct *vma;
> > int ret = 0;
> > int err;
> > - int locked;
> >
> > if (nr_frames == 0)
> > return 0;
> > @@ -48,40 +47,25 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> >
> > start = untagged_addr(start);
> >
> > - mmap_read_lock(mm);
> > - locked = 1;
> > - vma = find_vma_intersection(mm, start, start + 1);
> > - if (!vma) {
> > - ret = -EFAULT;
> > - goto out;
> > - }
> > -
> > - /*
> > - * While get_vaddr_frames() could be used for transient (kernel
> > - * controlled lifetime) pinning of memory pages all current
> > - * users establish long term (userspace controlled lifetime)
> > - * page pinning. Treat get_vaddr_frames() like
> > - * get_user_pages_longterm() and disallow it for filesystem-dax
> > - * mappings.
> > - */
> > - if (vma_is_fsdax(vma)) {
> > - ret = -EOPNOTSUPP;
> > - goto out;
> > - }
> > -
> > - if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
> > + ret = pin_user_pages_fast(start, nr_frames,
> > + FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
> > + (struct page **)(vec->ptrs));
> > + if (ret > 0) {
> > vec->got_ref = true;
> > vec->is_pfns = false;
> > - ret = pin_user_pages_locked(start, nr_frames,
> > - gup_flags, (struct page **)(vec->ptrs), &locked);
>
> Should we drop the gup_flags argument, since it's ignored now?

Hm right I think an earlier version even had that, but then I moved to
inlining the functionality in all the places it's used.

I'll drop the gup flag.

> > - goto out;
> > + goto out_unlocked;
> > }
> >
>
> Should we initialize ret with 0 here, since pin_user_pages_fast() can
> return a negative error code, but below we use it as a counter for the
> looked up frames?

Indeed, that's a bug. Will fix for v5.
-Daniel

> Best regards,
> Tomasz
>
> > + mmap_read_lock(mm);
> > vec->got_ref = false;
> > vec->is_pfns = true;
> > do {
> > unsigned long *nums = frame_vector_pfns(vec);
> >
> > + vma = find_vma_intersection(mm, start, start + 1);
> > + if (!vma)
> > + break;
> > +
> > while (ret < nr_frames && start + PAGE_SIZE <= vma->vm_end) {
> > err = follow_pfn(vma, start, &nums[ret]);
> > if (err) {
> > @@ -92,17 +76,13 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
> > start += PAGE_SIZE;
> > ret++;
> > }
> > - /*
> > - * We stop if we have enough pages or if VMA doesn't completely
> > - * cover the tail page.
> > - */
> > - if (ret >= nr_frames || start < vma->vm_end)
> > + /* Bail out if VMA doesn't completely cover the tail page. */
> > + if (start < vma->vm_end)
> > break;
> > - vma = find_vma_intersection(mm, start, start + 1);
> > - } while (vma && vma->vm_flags & (VM_IO | VM_PFNMAP));
> > + } while (ret < nr_frames);
> > out:
> > - if (locked)
> > - mmap_read_unlock(mm);
> > + mmap_read_unlock(mm);
> > +out_unlocked:
> > if (!ret)
> > ret = -EFAULT;
> > if (ret > 0)
> > --
> > 2.28.0
> >



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch