Hey,
While looking at ZONE_DEVICE struct page reuse, particularly the last
patch[0], I found two possible improvements for follow_hugetlb_page(),
which is solely used for get_user_pages()/pin_user_pages().
The first patch batches page refcount updates, while the second tidies
up storing the subpages/vmas. Together they bring the cost of the slow
variant of gup() from ~87.6k usecs to ~5.8k usecs.
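To illustrate the idea behind the first patch, here is a minimal
user-space sketch. It is not the kernel code: struct toy_head,
grab_per_subpage() and grab_batched() are hypothetical names made up
for this example, and the atomic_ops counter exists only to make the
difference visible. It assumes the subpages being recorded all share
one head page, so the references can be counted locally and applied
to the head in a single atomic update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a compound page head; not struct page. */
struct toy_head {
	atomic_int refcount;
	unsigned long atomic_ops;	/* number of refcount updates issued */
};

/* Per-subpage variant: one atomic update for every subpage recorded. */
static void grab_per_subpage(struct toy_head *head, size_t nr_subpages)
{
	for (size_t i = 0; i < nr_subpages; i++) {
		atomic_fetch_add(&head->refcount, 1);
		head->atomic_ops++;
	}
}

/* Batched variant: count the subpages first, then update the head once. */
static void grab_batched(struct toy_head *head, size_t nr_subpages)
{
	int refs = 0;

	for (size_t i = 0; i < nr_subpages; i++)
		++refs;			/* local bookkeeping, no atomic op */

	if (refs) {
		atomic_fetch_add(&head->refcount, refs);
		head->atomic_ops++;
	}
}
```

For 512 subpages (e.g. a 2M hugepage with 4K base pages), both variants
leave the head with the same refcount, but the batched one gets there
with a single atomic update instead of 512.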
libhugetlbfs tests seem to pass, as do the gup_test benchmarks
with hugetlbfs vmas.
v2:
* switch from refs++ to ++refs;
* add Mike's Rb on patch 1;
* switch from page++ to mem_map_offset() in the second patch.
[0] https://lore.kernel.org/linux-mm/[email protected]/
Joao Martins (2):
  mm/hugetlb: grab head page refcount once for group of subpages
  mm/hugetlb: refactor subpage recording

 include/linux/mm.h |  3 +++
 mm/gup.c           |  5 ++--
 mm/hugetlb.c       | 66 +++++++++++++++++++++++++++-------------------
 3 files changed, 44 insertions(+), 30 deletions(-)
--
2.17.1