by Jason Gunthorpe

Subject: Re: [PATCH hmm 2/5] mm/hmm: make hmm_range_fault return 0 or -1

On Wed, Apr 22, 2020 at 07:52:29AM +0200, Christoph Hellwig wrote:
> On Tue, Apr 21, 2020 at 09:21:43PM -0300, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <[email protected]>
> >
> > hmm_vma_walk->last is supposed to be updated after every write to the
> > pfns, so that it can be returned by hmm_range_fault(). However, this is
> > not done consistently. Fortunately nothing checks the return code of
> > hmm_range_fault() for anything other than error.
> >
> > More importantly, last must be set before returning -EBUSY as it is used to
> > prevent reading an output pfn as input flags when the loop restarts.
> >
> > For clarity and simplicity make hmm_range_fault() return 0 or -ERRNO. Only
> > set last when returning -EBUSY.
> >
> > Signed-off-by: Jason Gunthorpe <[email protected]>
> > Documentation/vm/hmm.rst | 2 +-
> > drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++--
> > drivers/gpu/drm/nouveau/nouveau_svm.c | 6 +++---
> > include/linux/hmm.h | 2 +-
> > mm/hmm.c | 25 +++++++++----------------
> > 5 files changed, 16 insertions(+), 23 deletions(-)
> >
> > diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
> > index 4e3e9362afeb10..9924f2caa0184c 100644
> > +++ b/Documentation/vm/hmm.rst
> > @@ -161,7 +161,7 @@ device must complete the update before the driver callback returns.
> > When the device driver wants to populate a range of virtual addresses, it can
> > use::
> >
> > - long hmm_range_fault(struct hmm_range *range);
> > + int hmm_range_fault(struct hmm_range *range);
> >
> > It will trigger a page fault on missing or read-only entries if write access is
> > requested (see below). Page faults use the generic mm page fault code path just
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > index 6309ff72bd7876..efc1329a019127 100644
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> > @@ -852,12 +852,12 @@ int amdgpu_ttm_tt_get_user_pages(struct amdgpu_bo *bo, struct page **pages)
> > down_read(&mm->mmap_sem);
> > r = hmm_range_fault(range);
> > up_read(&mm->mmap_sem);
> > - if (unlikely(r <= 0)) {
> > + if (unlikely(r)) {
> > /*
> > * FIXME: This timeout should encompass the retry from
> > * mmu_interval_read_retry() as well.
> > */
> > - if ((r == 0 || r == -EBUSY) && !time_after(jiffies, timeout))
> > + if ((r == -EBUSY) && !time_after(jiffies, timeout))
>
> Please also kill the superfluous inner braces here.
>
> > + * Return: 0 or -ERRNO with one of the following status codes:
>
> Maybe say something like:
>
> * Returns 0 on success or one of the following error codes:
>
> Otherwise this looks good:

Got it, thanks

Jason
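
A minimal userspace sketch of the caller pattern discussed above, assuming the
new convention in which the fault function returns 0 on success or a negative
errno, and in which only -EBUSY is worth retrying. fake_range_fault(),
busy_left and fault_with_retry() are hypothetical names invented for this
illustration; they are not kernel APIs. The retry is bounded by an attempt
count here, whereas the amdgpu code quoted above bounds it with a jiffies
timeout.

```c
#include <errno.h>

/* Hypothetical stand-in for hmm_range_fault(): reports -EBUSY
 * busy_left times, then succeeds with 0. */
static int busy_left = 2;

static int fake_range_fault(void)
{
	if (busy_left > 0) {
		busy_left--;
		return -EBUSY;
	}
	return 0;
}

/* Call the fault function at most max_tries times. Only -EBUSY
 * triggers another attempt; 0 or any other error is returned
 * to the caller immediately. */
static int fault_with_retry(int max_tries)
{
	int r;

	do {
		r = fake_range_fault();
	} while (r == -EBUSY && --max_tries > 0);

	return r;
}
```

With a 0-or-errno return value the caller needs a single comparison against
-EBUSY, rather than the earlier check that treated both 0 and -EBUSY as
"retry".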

2020-04-29 22:45:13

by Jason Gunthorpe

Subject: Re: [PATCH hmm 5/5] mm/hmm: remove the customizable pfn format from hmm_range_fault

On Wed, Apr 22, 2020 at 01:52:32PM -0400, Felix Kuehling wrote:
> [+Philip Yang]
>
> Am 2020-04-21 um 8:21 p.m. schrieb Jason Gunthorpe:
> > From: Jason Gunthorpe <[email protected]>
> >
> > Presumably the intent here was that hmm_range_fault() could put the data
> > into some HW specific format and thus avoid some work. However, nothing
> > actually does that, and it isn't clear how anything actually could do that
> > as hmm_range_fault() provides CPU addresses which must be DMA mapped.
> >
> > Perhaps there is some special HW that does not need DMA mapping, but we
> > don't have any examples of this, and the theoretical performance win of
> > avoiding an extra scan over the pfns array doesn't seem worth the
> > complexity. Plus pfns needs to be scanned anyhow to sort out any
> > DEVICE_PRIVATE pages.
> >
> > This version replaces the uint64_t with an unsigned long containing a pfn
> > and fixed flags. On input flags is filled with the HMM_PFN_REQ_* values, on
> > successful output it is filled with HMM_PFN_* values, describing the state
> > of the pages.
> >
> > amdgpu is simple to convert, it doesn't use snapshot and doesn't use
> > per-page flags.
> >
> > nouveau uses only 16 hmm_pte entries at most (ie fits in a few cache
> > lines), and it sweeps over its pfns array a couple of times anyhow.
> >
> > Signed-off-by: Jason Gunthorpe <[email protected]>
> > Signed-off-by: Christoph Hellwig <[email protected]>
>
> Hi Jason,
>
> I pointed out a typo in the documentation inline. Other than that, the
> series is
>
> Acked-by: Felix Kuehling <[email protected]>
>
> I'll try to build it and run some basic tests later.

Got it, thanks! Let me know if there are problems

Jason
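
A minimal sketch of what "an unsigned long containing a pfn and fixed flags"
could look like, assuming the flags occupy the top bits of the word and the
pfn the remaining low bits. The DEMO_* names, the helper functions and the
choice of exactly two flag bits are assumptions made for illustration; the
thread does not show the actual bit layout used by the patch.

```c
#include <limits.h>
#include <stdbool.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical layout: the two highest bits are status flags,
 * everything below them is the page frame number. */
#define DEMO_PFN_VALID (1UL << (DEMO_BITS_PER_LONG - 1))
#define DEMO_PFN_WRITE (1UL << (DEMO_BITS_PER_LONG - 2))
#define DEMO_PFN_FLAGS (DEMO_PFN_VALID | DEMO_PFN_WRITE)

/* Combine a pfn and flags into one entry, masking each part so
 * neither can spill into the other's bits. */
static unsigned long demo_pack(unsigned long pfn, unsigned long flags)
{
	return (pfn & ~DEMO_PFN_FLAGS) | (flags & DEMO_PFN_FLAGS);
}

/* Extract the pfn by clearing the flag bits. */
static unsigned long demo_pfn(unsigned long entry)
{
	return entry & ~DEMO_PFN_FLAGS;
}

/* An entry is writable only if it is both valid and marked write. */
static bool demo_writable(unsigned long entry)
{
	unsigned long mask = DEMO_PFN_VALID | DEMO_PFN_WRITE;

	return (entry & mask) == mask;
}
```

Because the format is fixed, a driver that needs its own hardware encoding
would make one pass over the array and translate each entry, which is the
extra scan the commit message argues is cheap enough to accept.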