2019-07-22 11:53:28

by Christoph Hellwig

[permalink] [raw]
Subject: hmm_range_fault related fixes and legacy API removal v2

Hi Jérôme, Ben and Jason,

below is a series against the hmm tree which fixes up the mmap_sem
locking in nouveau and while at it also removes leftover legacy HMM APIs
only used by nouveau.

The first 4 patches are a bug fix for nouveau, which I suspect should
go into this merge window even if the code is marked as staging, just
to avoid people copying the breakage.

Changes since v1:
- don't return the valid state from hmm_range_unregister
- additional nouveau cleanups


2019-07-22 11:54:19

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 1/6] mm: always return EBUSY for invalid ranges in hmm_range_{fault,snapshot}

We should not have two different error codes for the same condition. In
addition this really complicates the code due to the special handling of
EAGAIN that drops the mmap_sem due to the FAULT_FLAG_ALLOW_RETRY logic
in the core vm.

Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Ralph Campbell <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
---
Documentation/vm/hmm.rst | 2 +-
mm/hmm.c | 10 ++++------
2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index 7d90964abbb0..710ce1c701bf 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -237,7 +237,7 @@ The usage pattern is::
ret = hmm_range_snapshot(&range);
if (ret) {
up_read(&mm->mmap_sem);
- if (ret == -EAGAIN) {
+ if (ret == -EBUSY) {
/*
* No need to check hmm_range_wait_until_valid() return value
* on retry we will get proper error with hmm_range_snapshot()
diff --git a/mm/hmm.c b/mm/hmm.c
index e1eedef129cf..16b6731a34db 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -946,7 +946,7 @@ EXPORT_SYMBOL(hmm_range_unregister);
* @range: range
* Return: -EINVAL if invalid argument, -ENOMEM out of memory, -EPERM invalid
* permission (for instance asking for write and range is read only),
- * -EAGAIN if you need to retry, -EFAULT invalid (ie either no valid
+ * -EBUSY if you need to retry, -EFAULT invalid (ie either no valid
* vma or it is illegal to access that range), number of valid pages
* in range->pfns[] (from range start address).
*
@@ -967,7 +967,7 @@ long hmm_range_snapshot(struct hmm_range *range)
do {
/* If range is no longer valid force retry. */
if (!range->valid)
- return -EAGAIN;
+ return -EBUSY;

vma = find_vma(hmm->mm, start);
if (vma == NULL || (vma->vm_flags & device_vma))
@@ -1062,10 +1062,8 @@ long hmm_range_fault(struct hmm_range *range, bool block)

do {
/* If range is no longer valid force retry. */
- if (!range->valid) {
- up_read(&hmm->mm->mmap_sem);
- return -EAGAIN;
- }
+ if (!range->valid)
+ return -EBUSY;

vma = find_vma(hmm->mm, start);
if (vma == NULL || (vma->vm_flags & device_vma))
--
2.20.1

2019-07-22 11:54:58

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 2/6] mm: move hmm_vma_range_done and hmm_vma_fault to nouveau

These two functions are marked as a legacy APIs to get rid of, but seem
to suit the current nouveau flow. Move it to the only user in
preparation for fixing a locking bug involving caller and callee.
All comments referring to the old API have been removed as this now
is a driver private helper.

Signed-off-by: Christoph Hellwig <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_svm.c | 45 +++++++++++++++++++++-
include/linux/hmm.h | 54 ---------------------------
2 files changed, 43 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 8c92374afcf2..cde09003c06b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -475,6 +475,47 @@ nouveau_svm_fault_cache(struct nouveau_svm *svm,
fault->inst, fault->addr, fault->access);
}

+static inline bool nouveau_range_done(struct hmm_range *range)
+{
+ bool ret = hmm_range_valid(range);
+
+ hmm_range_unregister(range);
+ return ret;
+}
+
+static int
+nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range,
+ bool block)
+{
+ long ret;
+
+ range->default_flags = 0;
+ range->pfn_flags_mask = -1UL;
+
+ ret = hmm_range_register(range, mirror,
+ range->start, range->end,
+ PAGE_SHIFT);
+ if (ret)
+ return (int)ret;
+
+ if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
+ up_read(&range->vma->vm_mm->mmap_sem);
+ return -EAGAIN;
+ }
+
+ ret = hmm_range_fault(range, block);
+ if (ret <= 0) {
+ if (ret == -EBUSY || !ret) {
+ up_read(&range->vma->vm_mm->mmap_sem);
+ ret = -EBUSY;
+ } else if (ret == -EAGAIN)
+ ret = -EBUSY;
+ hmm_range_unregister(range);
+ return ret;
+ }
+ return 0;
+}
+
static int
nouveau_svm_fault(struct nvif_notify *notify)
{
@@ -649,10 +690,10 @@ nouveau_svm_fault(struct nvif_notify *notify)
range.values = nouveau_svm_pfn_values;
range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
again:
- ret = hmm_vma_fault(&svmm->mirror, &range, true);
+ ret = nouveau_range_fault(&svmm->mirror, &range, true);
if (ret == 0) {
mutex_lock(&svmm->mutex);
- if (!hmm_vma_range_done(&range)) {
+ if (!nouveau_range_done(&range)) {
mutex_unlock(&svmm->mutex);
goto again;
}
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index b8a08b2a10ca..7ef56dc18050 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -484,60 +484,6 @@ long hmm_range_dma_unmap(struct hmm_range *range,
*/
#define HMM_RANGE_DEFAULT_TIMEOUT 1000

-/* This is a temporary helper to avoid merge conflict between trees. */
-static inline bool hmm_vma_range_done(struct hmm_range *range)
-{
- bool ret = hmm_range_valid(range);
-
- hmm_range_unregister(range);
- return ret;
-}
-
-/* This is a temporary helper to avoid merge conflict between trees. */
-static inline int hmm_vma_fault(struct hmm_mirror *mirror,
- struct hmm_range *range, bool block)
-{
- long ret;
-
- /*
- * With the old API the driver must set each individual entries with
- * the requested flags (valid, write, ...). So here we set the mask to
- * keep intact the entries provided by the driver and zero out the
- * default_flags.
- */
- range->default_flags = 0;
- range->pfn_flags_mask = -1UL;
-
- ret = hmm_range_register(range, mirror,
- range->start, range->end,
- PAGE_SHIFT);
- if (ret)
- return (int)ret;
-
- if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
- /*
- * The mmap_sem was taken by driver we release it here and
- * returns -EAGAIN which correspond to mmap_sem have been
- * drop in the old API.
- */
- up_read(&range->vma->vm_mm->mmap_sem);
- return -EAGAIN;
- }
-
- ret = hmm_range_fault(range, block);
- if (ret <= 0) {
- if (ret == -EBUSY || !ret) {
- /* Same as above, drop mmap_sem to match old API. */
- up_read(&range->vma->vm_mm->mmap_sem);
- ret = -EBUSY;
- } else if (ret == -EAGAIN)
- ret = -EBUSY;
- hmm_range_unregister(range);
- return ret;
- }
- return 0;
-}
-
/* Below are for HMM internal use only! Not to be used by device driver! */
static inline void hmm_mm_init(struct mm_struct *mm)
{
--
2.20.1

2019-07-22 11:55:42

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault

Currently nouveau_svm_fault expects nouveau_range_fault to never unlock
mmap_sem, but the latter unlocks it for a random selection of error
codes. Fix this up by always unlocking mmap_sem for non-zero return
values in nouveau_range_fault, and only unlocking it in the caller
for successful returns.

Signed-off-by: Christoph Hellwig <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_svm.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 5dd83a46578f..5de2d54b9782 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -494,8 +494,10 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range)
ret = hmm_range_register(range, mirror,
range->start, range->end,
PAGE_SHIFT);
- if (ret)
+ if (ret) {
+ up_read(&range->vma->vm_mm->mmap_sem);
return (int)ret;
+ }

if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
up_read(&range->vma->vm_mm->mmap_sem);
@@ -504,11 +506,9 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range)

ret = hmm_range_fault(range, true);
if (ret <= 0) {
- if (ret == -EBUSY || !ret) {
- up_read(&range->vma->vm_mm->mmap_sem);
- ret = -EBUSY;
- } else if (ret == -EAGAIN)
+ if (ret == 0)
ret = -EBUSY;
+ up_read(&range->vma->vm_mm->mmap_sem);
hmm_range_unregister(range);
return ret;
}
@@ -706,8 +706,8 @@ nouveau_svm_fault(struct nvif_notify *notify)
NULL);
svmm->vmm->vmm.object.client->super = false;
mutex_unlock(&svmm->mutex);
+ up_read(&svmm->mm->mmap_sem);
}
- up_read(&svmm->mm->mmap_sem);

/* Cancel any faults in the window whose pages didn't manage
* to keep their valid bit, or stay writeable when required.
--
2.20.1

2019-07-22 11:55:47

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 5/6] nouveau: return -EBUSY when hmm_range_wait_until_valid fails

-EAGAIN has a magic meaning for non-blocking faults, so don't overload
it. Given that the caller doesn't check for specific error codes this
change is purely cosmetic.

Signed-off-by: Christoph Hellwig <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 5de2d54b9782..a9c5c58d425b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -501,7 +501,7 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range)

if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
up_read(&range->vma->vm_mm->mmap_sem);
- return -EAGAIN;
+ return -EBUSY;
}

ret = hmm_range_fault(range, true);
--
2.20.1

2019-07-22 17:28:03

by Souptick Joarder

[permalink] [raw]
Subject: Re: [PATCH 1/6] mm: always return EBUSY for invalid ranges in hmm_range_{fault,snapshot}

On Mon, Jul 22, 2019 at 3:14 PM Christoph Hellwig <[email protected]> wrote:
>
> We should not have two different error codes for the same condition. In
> addition this really complicates the code due to the special handling of
> EAGAIN that drops the mmap_sem due to the FAULT_FLAG_ALLOW_RETRY logic
> in the core vm.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> Reviewed-by: Ralph Campbell <[email protected]>
> Reviewed-by: Felix Kuehling <[email protected]>
> ---
> Documentation/vm/hmm.rst | 2 +-
> mm/hmm.c | 10 ++++------
> 2 files changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
> index 7d90964abbb0..710ce1c701bf 100644
> --- a/Documentation/vm/hmm.rst
> +++ b/Documentation/vm/hmm.rst
> @@ -237,7 +237,7 @@ The usage pattern is::
> ret = hmm_range_snapshot(&range);
> if (ret) {
> up_read(&mm->mmap_sem);
> - if (ret == -EAGAIN) {
> + if (ret == -EBUSY) {
> /*
> * No need to check hmm_range_wait_until_valid() return value
> * on retry we will get proper error with hmm_range_snapshot()
> diff --git a/mm/hmm.c b/mm/hmm.c
> index e1eedef129cf..16b6731a34db 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -946,7 +946,7 @@ EXPORT_SYMBOL(hmm_range_unregister);
> * @range: range
> * Return: -EINVAL if invalid argument, -ENOMEM out of memory, -EPERM invalid
> * permission (for instance asking for write and range is read only),
> - * -EAGAIN if you need to retry, -EFAULT invalid (ie either no valid
> + * -EBUSY if you need to retry, -EFAULT invalid (ie either no valid
> * vma or it is illegal to access that range), number of valid pages
> * in range->pfns[] (from range start address).
> *
> @@ -967,7 +967,7 @@ long hmm_range_snapshot(struct hmm_range *range)
> do {
> /* If range is no longer valid force retry. */
> if (!range->valid)
> - return -EAGAIN;
> + return -EBUSY;
>
> vma = find_vma(hmm->mm, start);
> if (vma == NULL || (vma->vm_flags & device_vma))
> @@ -1062,10 +1062,8 @@ long hmm_range_fault(struct hmm_range *range, bool block)
>
> do {
> /* If range is no longer valid force retry. */
> - if (!range->valid) {
> - up_read(&hmm->mm->mmap_sem);
> - return -EAGAIN;
> - }
> + if (!range->valid)
> + return -EBUSY;

Is it fine to remove up_read(&hmm->mm->mmap_sem) ?

>
> vma = find_vma(hmm->mm, start);
> if (vma == NULL || (vma->vm_flags & device_vma))
> --
> 2.20.1
>

2019-07-23 08:48:32

by Ralph Campbell

[permalink] [raw]
Subject: Re: hmm_range_fault related fixes and legacy API removal v2


On 7/22/19 2:44 AM, Christoph Hellwig wrote:
> Hi Jérôme, Ben and Jason,
>
> below is a series against the hmm tree which fixes up the mmap_sem
> locking in nouveau and while at it also removes leftover legacy HMM APIs
> only used by nouveau.
>
> The first 4 patches are a bug fix for nouveau, which I suspect should
> go into this merge window even if the code is marked as staging, just
> to avoid people copying the breakage.
>
> Changes since v1:
> - don't return the valid state from hmm_range_unregister
> - additional nouveau cleanups
>

I ran some OpenCL tests from Jerome with nouveau and this series,
5.3.0-rc1, and my two HMM fixes:
("mm/hmm: fix ZONE_DEVICE anon page mapping reuse")
("mm/hmm: Fix bad subpage pointer in try_to_unmap_one")

You can add for the series:
Tested-by: Ralph Campbell <[email protected]>

2019-07-24 00:20:15

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 2/6] mm: move hmm_vma_range_done and hmm_vma_fault to nouveau

On Mon, Jul 22, 2019 at 11:44:22AM +0200, Christoph Hellwig wrote:
> These two functions are marked as a legacy APIs to get rid of, but seem
> to suit the current nouveau flow. Move it to the only user in
> preparation for fixing a locking bug involving caller and callee.
> All comments referring to the old API have been removed as this now
> is a driver private helper.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> drivers/gpu/drm/nouveau/nouveau_svm.c | 45 +++++++++++++++++++++-
> include/linux/hmm.h | 54 ---------------------------
> 2 files changed, 43 insertions(+), 56 deletions(-)

Reviewed-by: Jason Gunthorpe <[email protected]>

Jason

2019-07-24 00:37:42

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 5/6] nouveau: return -EBUSY when hmm_range_wait_until_valid fails

On Mon, Jul 22, 2019 at 11:44:25AM +0200, Christoph Hellwig wrote:
> -EAGAIN has a magic meaning for non-blocking faults, so don't overload
> it. Given that the caller doesn't check for specific error codes this
> change is purely cosmetic.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Agree

Reviewed-by: Jason Gunthorpe <[email protected]>

Jason

2019-07-24 00:37:55

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: hmm_range_fault related fixes and legacy API removal v2

On Mon, Jul 22, 2019 at 06:11:04PM -0700, Ralph Campbell wrote:
>
> On 7/22/19 2:44 AM, Christoph Hellwig wrote:
> > Hi Jérôme, Ben and Jason,
> >
> > below is a series against the hmm tree which fixes up the mmap_sem
> > locking in nouveau and while at it also removes leftover legacy HMM APIs
> > only used by nouveau.
> >
> > The first 4 patches are a bug fix for nouveau, which I suspect should
> > go into this merge window even if the code is marked as staging, just
> > to avoid people copying the breakage.
> >
> > Changes since v1:
> > - don't return the valid state from hmm_range_unregister
> > - additional nouveau cleanups
> >
>
> I ran some OpenCL tests from Jerome with nouveau and this series,
> 5.3.0-rc1, and my two HMM fixes:
> ("mm/hmm: fix ZONE_DEVICE anon page mapping reuse")
> ("mm/hmm: Fix bad subpage pointer in try_to_unmap_one")
>
> You can add for the series:
> Tested-by: Ralph Campbell <[email protected]>

Thanks, would you also rebase and resend the patch deleting
struct hmm_update ?

Jason

2019-07-24 00:38:54

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: hmm_range_fault related fixes and legacy API removal v2

On Mon, Jul 22, 2019 at 11:44:20AM +0200, Christoph Hellwig wrote:
> Hi Jérôme, Ben and Jason,
>
> below is a series against the hmm tree which fixes up the mmap_sem
> locking in nouveau and while at it also removes leftover legacy HMM APIs
> only used by nouveau.
>
> The first 4 patches are a bug fix for nouveau, which I suspect should
> go into this merge window even if the code is marked as staging, just
> to avoid people copying the breakage.

Ignoring the STAGING issue I've tried to use the same guideline as for
-stable for -rc ..

So this is a real problem, we definitely hit the locking bugs if we
retry/etc under stress, so I would be OK to send it to Linus for
early-rc.

However, it doesn't look like the 1st patch is fixing a current bug
though, the only callers uses blocking = true, so just the middle
three are -rc?

Thanks,
Jason

2019-07-24 01:30:16

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/6] mm: always return EBUSY for invalid ranges in hmm_range_{fault,snapshot}

On Tue, Jul 23, 2019 at 02:54:45PM +0000, Jason Gunthorpe wrote:
> I think without the commit message I wouldn't have been able to
> understand that, so Christoph, could you also add the comment below
> please?

I don't think this belongs into this patch. I can add it as a separate
patch under your name and with your signoff if you are ok with that.

2019-07-24 01:33:53

by Christoph Hellwig

[permalink] [raw]
Subject: Re: hmm_range_fault related fixes and legacy API removal v2

On Tue, Jul 23, 2019 at 03:27:41PM +0000, Jason Gunthorpe wrote:
> Ignoring the STAGING issue I've tried to use the same guideline as for
> -stable for -rc ..
>
> So this is a real problem, we definitely hit the locking bugs if we
> retry/etc under stress, so I would be OK to send it to Linus for
> early-rc.
>
> However, it doesn't look like the 1st patch is fixing a current bug
> though, the only callers uses blocking = true, so just the middle
> three are -rc?

nonblocking isn't used anywher, but it is a major, major API bug.
Your call, but if it was my tree I'd probably send it to Linus.

2019-07-24 01:33:59

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault

On Tue, Jul 23, 2019 at 03:18:28PM +0000, Jason Gunthorpe wrote:
> Hum..
>
> The caller does this:
>
> again:
> ret = nouveau_range_fault(&svmm->mirror, &range);
> if (ret == 0) {
> mutex_lock(&svmm->mutex);
> if (!nouveau_range_done(&range)) {
> mutex_unlock(&svmm->mutex);
> goto again;
>
> And we can't call nouveau_range_fault() -> hmm_range_fault() without
> holding the mmap_sem, so we can't allow nouveau_range_fault to unlock
> it.

Goto again can only happen if nouveau_range_fault was successful,
in which case we did not drop mmap_sem.

Also:

> ret = hmm_range_fault(range, true);
> if (ret <= 0) {
> if (ret == 0)
> ret = -EBUSY;
> - up_read(&range->vma->vm_mm->mmap_sem);
> hmm_range_unregister(range);

This would hold mmap_sem over hmm_range_unregister, which can lead
to deadlock if we call exit_mmap and then acquire mmap_sem again.

2019-07-24 01:43:23

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault

On Tue, Jul 23, 2019 at 02:17:31PM -0300, Jason Gunthorpe wrote:
> That reminds me, this code is also leaking hmm_range_unregister() in
> the success path, right?

No, that is done by hmm_vma_range_done / nouveau_range_done for the
success path.

>
> I think the right way to structure this is to move the goto again and
> related into the nouveau_range_fault() so the whole retry algorithm is
> sensibly self contained.

Then we'd take svmm->mutex inside the helper and let the caller
unlock that. Either way it is a bit of a mess, and I'd rather prefer
if someone has the hardware would do a grand rewrite of this path
eventually. Alternatively if no one signs up to mainain this code
we should eventually drop it given the staging status.

2019-07-24 02:19:28

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 1/6] mm: always return EBUSY for invalid ranges in hmm_range_{fault,snapshot}

On Mon, Jul 22, 2019 at 08:07:33PM +0530, Souptick Joarder wrote:
> On Mon, Jul 22, 2019 at 3:14 PM Christoph Hellwig <[email protected]> wrote:
> >
> > We should not have two different error codes for the same condition. In
> > addition this really complicates the code due to the special handling of
> > EAGAIN that drops the mmap_sem due to the FAULT_FLAG_ALLOW_RETRY logic
> > in the core vm.
> >
> > Signed-off-by: Christoph Hellwig <[email protected]>
> > Reviewed-by: Ralph Campbell <[email protected]>
> > Reviewed-by: Felix Kuehling <[email protected]>
> > Documentation/vm/hmm.rst | 2 +-
> > mm/hmm.c | 10 ++++------
> > 2 files changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
> > index 7d90964abbb0..710ce1c701bf 100644
> > +++ b/Documentation/vm/hmm.rst
> > @@ -237,7 +237,7 @@ The usage pattern is::
> > ret = hmm_range_snapshot(&range);
> > if (ret) {
> > up_read(&mm->mmap_sem);
> > - if (ret == -EAGAIN) {
> > + if (ret == -EBUSY) {
> > /*
> > * No need to check hmm_range_wait_until_valid() return value
> > * on retry we will get proper error with hmm_range_snapshot()
> > diff --git a/mm/hmm.c b/mm/hmm.c
> > index e1eedef129cf..16b6731a34db 100644
> > +++ b/mm/hmm.c
> > @@ -946,7 +946,7 @@ EXPORT_SYMBOL(hmm_range_unregister);
> > * @range: range
> > * Return: -EINVAL if invalid argument, -ENOMEM out of memory, -EPERM invalid
> > * permission (for instance asking for write and range is read only),
> > - * -EAGAIN if you need to retry, -EFAULT invalid (ie either no valid
> > + * -EBUSY if you need to retry, -EFAULT invalid (ie either no valid
> > * vma or it is illegal to access that range), number of valid pages
> > * in range->pfns[] (from range start address).
> > *
> > @@ -967,7 +967,7 @@ long hmm_range_snapshot(struct hmm_range *range)
> > do {
> > /* If range is no longer valid force retry. */
> > if (!range->valid)
> > - return -EAGAIN;
> > + return -EBUSY;
> >
> > vma = find_vma(hmm->mm, start);
> > if (vma == NULL || (vma->vm_flags & device_vma))
> > @@ -1062,10 +1062,8 @@ long hmm_range_fault(struct hmm_range *range, bool block)
> >
> > do {
> > /* If range is no longer valid force retry. */
> > - if (!range->valid) {
> > - up_read(&hmm->mm->mmap_sem);
> > - return -EAGAIN;
> > - }
> > + if (!range->valid)
> > + return -EBUSY;
>
> Is it fine to remove up_read(&hmm->mm->mmap_sem) ?

It seems very subtle, but under the covers this calls
handle_mm_fault() with FAULT_FLAG_ALLOW_RETRY which will cause the
mmap sem to become unlocked along the -EAGAIN return path.

I think without the commit message I wouldn't have been able to
understand that, so Christoph, could you also add the comment below
please?

Otherwise

Reviewed-by: Jason Gunthorpe <[email protected]>

Frankly, I find the 'int *locked' scheme of GUP easier to understand..

diff --git a/mm/hmm.c b/mm/hmm.c
index 16b6731a34db79..54b3a4162ae949 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -301,8 +301,10 @@ static int hmm_vma_do_fault(struct mm_walk *walk, unsigned long addr,
flags |= hmm_vma_walk->block ? 0 : FAULT_FLAG_ALLOW_RETRY;
flags |= write_fault ? FAULT_FLAG_WRITE : 0;
ret = handle_mm_fault(vma, addr, flags);
- if (ret & VM_FAULT_RETRY)
+ if (ret & VM_FAULT_RETRY) {
+ /* Note, handle_mm_fault did up_read(&mm->mmap_sem)) */
return -EAGAIN;
+ }
if (ret & VM_FAULT_ERROR) {
*pfn = range->values[HMM_PFN_ERROR];
return -EFAULT;

2019-07-24 02:22:33

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault

On Mon, Jul 22, 2019 at 11:44:24AM +0200, Christoph Hellwig wrote:
> Currently nouveau_svm_fault expects nouveau_range_fault to never unlock
> mmap_sem, but the latter unlocks it for a random selection of error
> codes. Fix this up by always unlocking mmap_sem for non-zero return
> values in nouveau_range_fault, and only unlocking it in the caller
> for successful returns.
>
> Signed-off-by: Christoph Hellwig <[email protected]>
> drivers/gpu/drm/nouveau/nouveau_svm.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index 5dd83a46578f..5de2d54b9782 100644
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -494,8 +494,10 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range)
> ret = hmm_range_register(range, mirror,
> range->start, range->end,
> PAGE_SHIFT);
> - if (ret)
> + if (ret) {
> + up_read(&range->vma->vm_mm->mmap_sem);
> return (int)ret;
> + }
>
> if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
> up_read(&range->vma->vm_mm->mmap_sem);
> @@ -504,11 +506,9 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range)
>
> ret = hmm_range_fault(range, true);
> if (ret <= 0) {
> - if (ret == -EBUSY || !ret) {
> - up_read(&range->vma->vm_mm->mmap_sem);
> - ret = -EBUSY;
> - } else if (ret == -EAGAIN)
> + if (ret == 0)
> ret = -EBUSY;
> + up_read(&range->vma->vm_mm->mmap_sem);
> hmm_range_unregister(range);
> return ret;

Hum..

The caller does this:

again:
ret = nouveau_range_fault(&svmm->mirror, &range);
if (ret == 0) {
mutex_lock(&svmm->mutex);
if (!nouveau_range_done(&range)) {
mutex_unlock(&svmm->mutex);
goto again;

And we can't call nouveau_range_fault() -> hmm_range_fault() without
holding the mmap_sem, so we can't allow nouveau_range_fault to unlock
it.

Maybe this instead?

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index a9c5c58d425b3d..92cf760a9bcc5d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -494,21 +494,16 @@ nouveau_range_fault(struct hmm_mirror *mirror, struct hmm_range *range)
ret = hmm_range_register(range, mirror,
range->start, range->end,
PAGE_SHIFT);
- if (ret) {
- up_read(&range->vma->vm_mm->mmap_sem);
+ if (ret)
return (int)ret;
- }

- if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT)) {
- up_read(&range->vma->vm_mm->mmap_sem);
+ if (!hmm_range_wait_until_valid(range, HMM_RANGE_DEFAULT_TIMEOUT))
return -EBUSY;
- }

ret = hmm_range_fault(range, true);
if (ret <= 0) {
if (ret == 0)
ret = -EBUSY;
- up_read(&range->vma->vm_mm->mmap_sem);
hmm_range_unregister(range);
return ret;
}
@@ -706,8 +701,8 @@ nouveau_svm_fault(struct nvif_notify *notify)
NULL);
svmm->vmm->vmm.object.client->super = false;
mutex_unlock(&svmm->mutex);
- up_read(&svmm->mm->mmap_sem);
}
+ up_read(&svmm->mm->mmap_sem);

/* Cancel any faults in the window whose pages didn't manage
* to keep their valid bit, or stay writeable when required.

2019-07-24 02:25:02

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 1/6] mm: always return EBUSY for invalid ranges in hmm_range_{fault,snapshot}

On Tue, Jul 23, 2019 at 06:19:07PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 02:54:45PM +0000, Jason Gunthorpe wrote:
> > I think without the commit message I wouldn't have been able to
> > understand that, so Christoph, could you also add the comment below
> > please?
>
> I don't think this belongs into this patch. I can add it as a separate
> patch under your name and with your signoff if you are ok with that.

Yep, thanks

Jason

2019-07-24 02:25:09

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault

On Tue, Jul 23, 2019 at 07:23:35PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 02:17:31PM -0300, Jason Gunthorpe wrote:
> > That reminds me, this code is also leaking hmm_range_unregister() in
> > the success path, right?
>
> No, that is done by hmm_vma_range_done / nouveau_range_done for the
> success path.

.. which is done with the mmap_sem held :(

> > I think the right way to structure this is to move the goto again and
> > related into the nouveau_range_fault() so the whole retry algorithm is
> > sensibly self contained.
>
> Then we'd take svmm->mutex inside the helper and let the caller
> unlock that. Either way it is a bit of a mess, and I'd rather prefer
> if someone has the hardware would do a grand rewrite of this path
> eventually. Alternatively if no one signs up to mainain this code
> we should eventually drop it given the staging status.

I tend to agree with the sentiment, it just makes me sad that all the
examples we have of these APIs are so troubled.

Jason

2019-07-24 02:25:55

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 4/6] nouveau: unlock mmap_sem on all errors from nouveau_range_fault

On Tue, Jul 23, 2019 at 06:30:48PM +0200, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 03:18:28PM +0000, Jason Gunthorpe wrote:
> > Hum..
> >
> > The caller does this:
> >
> > again:
> > ret = nouveau_range_fault(&svmm->mirror, &range);
> > if (ret == 0) {
> > mutex_lock(&svmm->mutex);
> > if (!nouveau_range_done(&range)) {
> > mutex_unlock(&svmm->mutex);
> > goto again;
> >
> > And we can't call nouveau_range_fault() -> hmm_range_fault() without
> > holding the mmap_sem, so we can't allow nouveau_range_fault to unlock
> > it.
>
> Goto again can only happen if nouveau_range_fault was successful,
> in which case we did not drop mmap_sem.

Oh, right we switch from success = number of pages to success =0..

However the reason this looks so weird to me is that the locking
pattern isn't being followed, any result of hmm_range_fault should be
ignored until we enter the svmm->mutex and check if there was a
colliding invalidation.

So the 'goto again' *should* be possible even if range_fault failed.

But that is not for this patch..

> > ret = hmm_range_fault(range, true);
> > if (ret <= 0) {
> > if (ret == 0)
> > ret = -EBUSY;
> > - up_read(&range->vma->vm_mm->mmap_sem);
> > hmm_range_unregister(range);
>
> This would hold mmap_sem over hmm_range_unregister, which can lead
> to deadlock if we call exit_mmap and then acquire mmap_sem again.

That reminds me, this code is also leaking hmm_range_unregister() in
the success path, right?

I think the right way to structure this is to move the goto again and
related into the nouveau_range_fault() so the whole retry algorithm is
sensibly self contained.

Jason