This code was using get_user_pages_fast(), in a "Case 2" scenario
(DMA/RDMA), using the categorization from [1]. That means that it's
time to convert the get_user_pages_fast() + put_page() calls to
pin_user_pages_fast() + unpin_user_page() calls.
There is some helpful background in [2]: basically, this is a small
part of fixing a long-standing disconnect between pinning pages, and
file systems' use of those pages.
[1] Documentation/core-api/pin_user_pages.rst
[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/
Signed-off-by: Souptick Joarder <[email protected]>
Cc: John Hubbard <[email protected]>
Hi,
I have compile-tested this, but I am unable to run-time test it, so any
testing help is much appreciated.
---
drivers/staging/gasket/gasket_page_table.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/staging/gasket/gasket_page_table.c b/drivers/staging/gasket/gasket_page_table.c
index f6d7157..d712ad4 100644
--- a/drivers/staging/gasket/gasket_page_table.c
+++ b/drivers/staging/gasket/gasket_page_table.c
@@ -449,7 +449,7 @@ static bool gasket_release_page(struct page *page)
 	if (!PageReserved(page))
 		SetPageDirty(page);
-	put_page(page);
+	unpin_user_page(page);
 	return true;
 }
@@ -486,12 +486,12 @@ static int gasket_perform_mapping(struct gasket_page_table *pg_tbl,
 			ptes[i].dma_addr = pg_tbl->coherent_pages[0].paddr +
 					   off + i * PAGE_SIZE;
 		} else {
-			ret = get_user_pages_fast(page_addr - offset, 1,
+			ret = pin_user_pages_fast(page_addr - offset, 1,
 						  FOLL_WRITE, &page);
 			if (ret <= 0) {
 				dev_err(pg_tbl->device,
-					"get user pages failed for addr=0x%lx, offset=0x%lx [ret=%d]\n",
+					"pin user pages failed for addr=0x%lx, offset=0x%lx [ret=%d]\n",
 					page_addr, offset, ret);
 				return ret ? ret : -ENOMEM;
 			}
--
1.9.1
On Thu, May 28, 2020 at 02:32:42AM +0530, Souptick Joarder wrote:
> This code was using get_user_pages_fast(), in a "Case 2" scenario
> (DMA/RDMA), using the categorization from [1]. That means that it's
> time to convert the get_user_pages_fast() + put_page() calls to
> pin_user_pages_fast() + unpin_user_page() calls.
You are saying that the page is used for DIO and not DMA, but it sure
looks to me like it is used for DMA.
	/* Map the page into DMA space. */
	ptes[i].dma_addr =
		dma_map_page(pg_tbl->device, page, 0, PAGE_SIZE,
			     DMA_BIDIRECTIONAL);
To be honest, that starting paragraph was confusing. At first I thought
you were saying gasket was an RDMA driver. :P I shouldn't have to read
a different document to understand the commit message. It should be
summarized enough and the other documentation is supplemental.
"In 2019 we introduced pin_user_pages() and now we are converting
get_user_pages() to the new API as appropriate".
>
> There is some helpful background in [2]: basically, this is a small
> part of fixing a long-standing disconnect between pinning pages, and
> file systems' use of those pages.
What is the impact of this patch on runtime?
>
> [1] Documentation/core-api/pin_user_pages.rst
>
> [2] "Explicit pinning of user-space pages":
> https://lwn.net/Articles/807108/
>
> Signed-off-by: Souptick Joarder <[email protected]>
> Cc: John Hubbard <[email protected]>
>
> Hi,
>
> I'm compile tested this, but unable to run-time test, so any testing
> help is much appriciated.
> ---
The "Hi" part of the patch should have been under the "---" cut-off line,
so this will definitely need to be resent.
regards,
dan carpenter
On Thu, May 28, 2020 at 4:34 PM Dan Carpenter <[email protected]> wrote:
>
> On Thu, May 28, 2020 at 02:32:42AM +0530, Souptick Joarder wrote:
> > This code was using get_user_pages_fast(), in a "Case 2" scenario
> > (DMA/RDMA), using the categorization from [1]. That means that it's
> > time to convert the get_user_pages_fast() + put_page() calls to
> > pin_user_pages_fast() + unpin_user_page() calls.
>
> You are saying that the page is used for DIO and not DMA, but it sure
> looks to me like it is used for DMA.
No, I was referring to the "Case 2" scenario in the change log, which means
it is used for DMA, not DIO.
>
> 503 /* Map the page into DMA space. */
> 504 ptes[i].dma_addr =
> 505 dma_map_page(pg_tbl->device, page, 0, PAGE_SIZE,
> 506 DMA_BIDIRECTIONAL);
>
> To be honest, that starting paragraph was confusing. At first I thought
> you were saying gasket was an RDMA driver. :P I shouldn't have to read
> a different document to understand the commit message. It should be
> summarized enough and the other documentation is supplemental.
>
> "In 2019 we introduced pin_user_pages() and now we are converting
> get_user_pages() to the new API as appropriate".
All the other similar conversions have similar change logs, so I was trying
to stay consistent with them. John might have a different opinion on this.
John, any further opinion?
>
> >
> > There is some helpful background in [2]: basically, this is a small
> > part of fixing a long-standing disconnect between pinning pages, and
> > file systems' use of those pages.
>
> What is the impact of this patch on runtime?
I don't have the hardware to validate the runtime impact, so I will
wait to see whether someone else is able to validate it.
>
> >
> > [1] Documentation/core-api/pin_user_pages.rst
> >
> > [2] "Explicit pinning of user-space pages":
> > https://lwn.net/Articles/807108/
> >
> > Signed-off-by: Souptick Joarder <[email protected]>
> > Cc: John Hubbard <[email protected]>
> >
> > Hi,
> >
> > I'm compile tested this, but unable to run-time test, so any testing
> > help is much appriciated.
> > ---
>
> The "Hi" part of patch should have been under the "---" cut off line so
> this will definitely need to be resent.
Sorry about that.
I will wait for feedback from John before resending it :)
>
> regards,
> dan carpenter
>
On Fri, May 29, 2020 at 11:46 AM Souptick Joarder <[email protected]> wrote:
>
> On Thu, May 28, 2020 at 4:34 PM Dan Carpenter <[email protected]> wrote:
> >
> > On Thu, May 28, 2020 at 02:32:42AM +0530, Souptick Joarder wrote:
> > > This code was using get_user_pages_fast(), in a "Case 2" scenario
> > > (DMA/RDMA), using the categorization from [1]. That means that it's
> > > time to convert the get_user_pages_fast() + put_page() calls to
> > > pin_user_pages_fast() + unpin_user_page() calls.
> >
> > You are saying that the page is used for DIO and not DMA, but it sure
> > looks to me like it is used for DMA.
>
> No, I was referring to "Case 2" scenario in change log which means it is
> used for DMA, not DIO.
>
> >
> > 503 /* Map the page into DMA space. */
> > 504 ptes[i].dma_addr =
> > 505 dma_map_page(pg_tbl->device, page, 0, PAGE_SIZE,
> > 506 DMA_BIDIRECTIONAL);
> >
> > To be honest, that starting paragraph was confusing. At first I thought
> > you were saying gasket was an RDMA driver. :P I shouldn't have to read
> > a different document to understand the commit message. It should be
> > summarized enough and the other documentation is supplemental.
> >
> > "In 2019 we introduced pin_user_pages() and now we are converting
> > get_user_pages() to the new API as appropriate".
>
> As all other similar conversion have similar change logs, so I was trying
> to maintain the same. John might have a different opinion on this.
For example, I was referring to a few recent, similar commits for their change logs.
http://lkml.kernel.org/r/[email protected]
https://lore.kernel.org/r/[email protected]
>
> John, Any further opinion ??
>
> >
> > >
> > > There is some helpful background in [2]: basically, this is a small
> > > part of fixing a long-standing disconnect between pinning pages, and
> > > file systems' use of those pages.
> >
> > What is the impact of this patch on runtime?
>
> I don't have the hardware to validate the runtime impact and will
> wait if someone is going to validate it for runtime impact.
>
> >
> > >
> > > [1] Documentation/core-api/pin_user_pages.rst
> > >
> > > [2] "Explicit pinning of user-space pages":
> > > https://lwn.net/Articles/807108/
> > >
> > > Signed-off-by: Souptick Joarder <[email protected]>
> > > Cc: John Hubbard <[email protected]>
> > >
> > > Hi,
> > >
> > > I'm compile tested this, but unable to run-time test, so any testing
> > > help is much appriciated.
> > > ---
> >
> > The "Hi" part of patch should have been under the "---" cut off line so
> > this will definitely need to be resent.
>
> Sorry about it.
> Will wait for feedback from John before resend it :)
>
> >
> > regards,
> > dan carpenter
> >
On 2020-05-28 23:27, Souptick Joarder wrote:
> On Fri, May 29, 2020 at 11:46 AM Souptick Joarder <[email protected]> wrote:
>>
>> On Thu, May 28, 2020 at 4:34 PM Dan Carpenter <[email protected]> wrote:
>>>
>>> On Thu, May 28, 2020 at 02:32:42AM +0530, Souptick Joarder wrote:
>>>> This code was using get_user_pages_fast(), in a "Case 2" scenario
>>>> (DMA/RDMA), using the categorization from [1]. That means that it's
>>>> time to convert the get_user_pages_fast() + put_page() calls to
>>>> pin_user_pages_fast() + unpin_user_page() calls.
>>>
>>> You are saying that the page is used for DIO and not DMA, but it sure
>>> looks to me like it is used for DMA.
>>
>> No, I was referring to "Case 2" scenario in change log which means it is
>> used for DMA, not DIO.
Hi,
Dan, I'm also uncertain as to how you read this as referring to DIO. Case 2 is
DMA or RDMA, and in fact the proposed commit log says both of those things:
Case 2 and DMA/RDMA. I don't see "DIO" anywhere here...
>>
>>>
>>> 503 /* Map the page into DMA space. */
>>> 504 ptes[i].dma_addr =
>>> 505 dma_map_page(pg_tbl->device, page, 0, PAGE_SIZE,
>>> 506 DMA_BIDIRECTIONAL);
>>>
>>> To be honest, that starting paragraph was confusing. At first I thought
>>> you were saying gasket was an RDMA driver. :P I shouldn't have to read
>>> a different document to understand the commit message. It should be
>>> summarized enough and the other documentation is supplemental.
>>>
>>> "In 2019 we introduced pin_user_pages() and now we are converting
>>> get_user_pages() to the new API as appropriate".
>>
>> As all other similar conversion have similar change logs, so I was trying
>> to maintain the same. John might have a different opinion on this.
>
> For example, I was referring to few recent similar commits for change logs.
>
> http://lkml.kernel.org/r/[email protected]
> https://lore.kernel.org/r/[email protected]
>
>
>>
>> John, Any further opinion ??
Well, I've gotten away with the current wording for quite a few patches so
far, but that sure doesn't mean it's perfect! :)
Maybe adding the words that Dan suggests, above, will suffice? Here:
>>> "In 2019 we introduced pin_user_pages() and now we are converting
>>> get_user_pages() to the new API as appropriate".
thanks,
--
John Hubbard
NVIDIA
On Fri, May 29, 2020 at 11:57:09AM +0530, Souptick Joarder wrote:
> On Fri, May 29, 2020 at 11:46 AM Souptick Joarder <[email protected]> wrote:
> >
> > On Thu, May 28, 2020 at 4:34 PM Dan Carpenter <[email protected]> wrote:
> > >
> > > On Thu, May 28, 2020 at 02:32:42AM +0530, Souptick Joarder wrote:
> > > > This code was using get_user_pages_fast(), in a "Case 2" scenario
> > > > (DMA/RDMA), using the categorization from [1]. That means that it's
> > > > time to convert the get_user_pages_fast() + put_page() calls to
> > > > pin_user_pages_fast() + unpin_user_page() calls.
> > >
> > > You are saying that the page is used for DIO and not DMA, but it sure
> > > looks to me like it is used for DMA.
> >
> > No, I was referring to "Case 2" scenario in change log which means it is
> > used for DMA, not DIO.
You can't use pin_user_pages() for DMA. This was the second reason that I
was confused.
mm/gup.c
/**
 * pin_user_pages_fast() - pin user pages in memory without taking locks
 *
 * @start:      starting user address
 * @nr_pages:   number of pages from start to pin
 * @gup_flags:  flags modifying pin behaviour
 * @pages:      array that receives pointers to the pages pinned.
 *              Should be at least nr_pages long.
 *
 * Nearly the same as get_user_pages_fast(), except that FOLL_PIN is set. See
 * get_user_pages_fast() for documentation on the function arguments, because
 * the arguments here are identical.
 *
 * FOLL_PIN means that the pages must be released via unpin_user_page(). Please
 * see Documentation/core-api/pin_user_pages.rst for further details.
 *
 * This is intended for Case 1 (DIO) in Documentation/core-api/pin_user_pages.rst. It
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 * is NOT intended for Case 2 (RDMA: long-term pins).
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 */
int pin_user_pages_fast(unsigned long start, int nr_pages,
			unsigned int gup_flags, struct page **pages)
{
	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
		return -EINVAL;

	gup_flags |= FOLL_PIN;
	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
}
EXPORT_SYMBOL_GPL(pin_user_pages_fast);
regards,
dan carpenter
On Fri, May 29, 2020 at 12:38:20AM -0700, John Hubbard wrote:
> On 2020-05-28 23:27, Souptick Joarder wrote:
> > On Fri, May 29, 2020 at 11:46 AM Souptick Joarder <[email protected]> wrote:
> > >
> > > On Thu, May 28, 2020 at 4:34 PM Dan Carpenter <[email protected]> wrote:
> > > >
> > > > On Thu, May 28, 2020 at 02:32:42AM +0530, Souptick Joarder wrote:
> > > > > This code was using get_user_pages_fast(), in a "Case 2" scenario
> > > > > (DMA/RDMA), using the categorization from [1]. That means that it's
> > > > > time to convert the get_user_pages_fast() + put_page() calls to
> > > > > pin_user_pages_fast() + unpin_user_page() calls.
> > > >
> > > > You are saying that the page is used for DIO and not DMA, but it sure
> > > > looks to me like it is used for DMA.
> > >
> > > No, I was referring to "Case 2" scenario in change log which means it is
> > > used for DMA, not DIO.
>
> Hi,
>
> Dan, I also uncertain as to how you read this as referring to DIO. Case 2 is
> DMA or RDMA, and in fact the proposed commit log says both of those things:
> Case 2 and DMA/RDMA. I don't see "DIO" anywhere here...
I thought he meant that the original code was appropriate for DMA and he
was fixing it. :P
regards,
dan carpenter
On 2020-05-29 00:46, Dan Carpenter wrote:
> On Fri, May 29, 2020 at 11:57:09AM +0530, Souptick Joarder wrote:
>> On Fri, May 29, 2020 at 11:46 AM Souptick Joarder <[email protected]> wrote:
>>>
>>> On Thu, May 28, 2020 at 4:34 PM Dan Carpenter <[email protected]> wrote:
>>>>
>>>> On Thu, May 28, 2020 at 02:32:42AM +0530, Souptick Joarder wrote:
>>>>> This code was using get_user_pages_fast(), in a "Case 2" scenario
>>>>> (DMA/RDMA), using the categorization from [1]. That means that it's
>>>>> time to convert the get_user_pages_fast() + put_page() calls to
>>>>> pin_user_pages_fast() + unpin_user_page() calls.
>>>>
>>>> You are saying that the page is used for DIO and not DMA, but it sure
>>>> looks to me like it is used for DMA.
>>>
>>> No, I was referring to "Case 2" scenario in change log which means it is
>>> used for DMA, not DIO.
>
> You can't use pin_user_pages() for DMA. This was second reason that I
> was confused.
OK, now it is getting interesting!
>
> mm/gup.c
> 2863 /**
> 2864 * pin_user_pages_fast() - pin user pages in memory without taking locks
> 2865 *
> 2866 * @start: starting user address
> 2867 * @nr_pages: number of pages from start to pin
> 2868 * @gup_flags: flags modifying pin behaviour
> 2869 * @pages: array that receives pointers to the pages pinned.
> 2870 * Should be at least nr_pages long.
> 2871 *
> 2872 * Nearly the same as get_user_pages_fast(), except that FOLL_PIN is set. See
> 2873 * get_user_pages_fast() for documentation on the function arguments, because
> 2874 * the arguments here are identical.
> 2875 *
> 2876 * FOLL_PIN means that the pages must be released via unpin_user_page(). Please
> 2877 * see Documentation/core-api/pin_user_pages.rst for further details.
> 2878 *
> 2879 * This is intended for Case 1 (DIO) in Documentation/core-api/pin_user_pages.rst. It
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 2880 * is NOT intended for Case 2 (RDMA: long-term pins).
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I'm trying to figure out why I wrote that. It seems just wrong, because once the
page is dma-pinned, it will work just fine for either Case 1 or Case 2. hmmm, I
think this was from a few design ideas ago, when we were still working through the
FOLL_LONGTERM and FOLL_PIN thoughts and how the pin_user_pages*() API set should
look.
At this point, it's looking very much like a (my) documentation bug: all 4 of the
"intended for Case 1 (DIO)" comments in mm/gup.c probably need to be simply deleted.
Good catch.
> 2881 */
> 2882 int pin_user_pages_fast(unsigned long start, int nr_pages,
> 2883 unsigned int gup_flags, struct page **pages)
> 2884 {
> 2885 /* FOLL_GET and FOLL_PIN are mutually exclusive. */
> 2886 if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> 2887 return -EINVAL;
> 2888
> 2889 gup_flags |= FOLL_PIN;
> 2890 return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
> 2891 }
> 2892 EXPORT_SYMBOL_GPL(pin_user_pages_fast);
>
> regards,
> dan carpenter
>
thanks,
--
John Hubbard
NVIDIA
Anyway, can you resend with the commit message rewritten? To me the
information that's most useful is from the lwn article:
"In short, if pages are being pinned for access to the data
contained within those pages, pin_user_pages() should be used. For
cases where the intent is to manipulate the page structures
corresponding to the pages rather than the data within them,
get_user_pages() is the correct interface."
What are the runtime implications of this patch? I'm still not clear on
that honestly.
When I'm reviewing patches, I also want to know how a bug was
introduced. In this case the original author did everything correctly
but we've just added some new features (cleanups, whatever).
I did skim the LWN article back in December but I don't remember the
details so I really want all this stuff re-stated in each commit
message.
regards,
dan carpenter
On 2020-05-29 04:53, Dan Carpenter wrote:
...
> What are the runtime implications of this patch? I'm still not clear on
> that honestly.
Instead of incrementing each page's refcount by 1 (with get_user_pages()),
pin_user_pages*() will increment by GUP_PIN_COUNTING_BIAS, which is 1024.
That by itself should not have any performance impact, of course, but
there are a couple more things:
For compound pages of more than 2 pages in size, it will also increment
a separate field in struct page, via hpage_pincount_add().
And finally, it will update /proc/vmstat counters on pin and unpin, via
the optimized mod_node_page_state() call.
So it's expected to be very light. And, for DMA (as opposed to DIO)
situations, the DMA setup time is inevitably much greater than any of
the above overheads, so I expect that this patch will be completely
invisible from a performance point of view.
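To make the refcount accounting above concrete, here is a minimal
user-space sketch. It is not kernel code: struct toy_page and every
toy_*() helper are hypothetical names made up for illustration, and
only the bias value of 1024 comes from the description above. The
toy_maybe_dma_pinned() check is an assumption about the kind of test
the bias makes possible; a page with a very large number of ordinary
references would also pass it, so treat it as a heuristic.

```c
#include <assert.h>
#include <stdbool.h>

/* Value quoted in this thread for GUP_PIN_COUNTING_BIAS. */
#define GUP_PIN_COUNTING_BIAS 1024

/* Hypothetical stand-in for struct page: only the refcount is modelled. */
struct toy_page {
	int refcount;
};

/* get_user_pages()-style reference: refcount goes up by 1. */
static inline void toy_get_page(struct toy_page *p)
{
	p->refcount += 1;
}

/* put_page()-style release: refcount goes down by 1. */
static inline void toy_put_page(struct toy_page *p)
{
	assert(p->refcount >= 1);
	p->refcount -= 1;
}

/* pin_user_pages*()-style pin: refcount goes up by the bias. */
static inline void toy_pin_page(struct toy_page *p)
{
	p->refcount += GUP_PIN_COUNTING_BIAS;
}

/* unpin_user_page()-style release: refcount goes down by the bias. */
static inline void toy_unpin_page(struct toy_page *p)
{
	assert(p->refcount >= GUP_PIN_COUNTING_BIAS);
	p->refcount -= GUP_PIN_COUNTING_BIAS;
}

/*
 * Heuristic (assumption): a refcount at or above the bias suggests that
 * at least one pin is outstanding. False positives are possible.
 */
static inline bool toy_maybe_dma_pinned(const struct toy_page *p)
{
	return p->refcount >= GUP_PIN_COUNTING_BIAS;
}
```

The point of the sketch is that a pin and a plain reference differ only
in the size of the increment, which is why the release side has to match
the acquire side (unpin_user_page() for pins, put_page() for gets).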
It would be a "nice to have", though, if anyone were able to do a
performance comparison on the gasket driver for this patch, and/or
basic runtime verification, since I'm sure it's a specialized setup.
thanks,
--
John Hubbard
NVIDIA