This issue of mapcount in hugetlb pages referenced by shared PMDs was
discussed in [1].  The following two patches address user visible
behavior caused by this issue.

Patches apply to mm-stable as they can also target stable backports.

Ongoing folio conversions cause context conflicts in the second patch
when applied to mm-unstable/linux-next.  I can create separate patch(es)
if people agree with these.

[1] https://lore.kernel.org/linux-mm/Y9BF+OCdWnCSilEu@monkey/

Mike Kravetz (2):
  mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
  migrate: hugetlb: Check for hugetlb shared PMD in node migration

 fs/proc/task_mmu.c      | 10 ++++++++--
 include/linux/hugetlb.h | 12 ++++++++++++
 mm/mempolicy.c          |  3 ++-
 3 files changed, 22 insertions(+), 3 deletions(-)

-- 
2.39.1

On Thu, 26 Jan 2023 14:27:19 -0800 Mike Kravetz <[email protected]> wrote:
> Ongoing folio conversions cause context conflicts in the second patch
> when applied to mm-unstable/linux-next. I can create separate patch(es)
> if people agree with these.

I fixed things up.  queue_folios_hugetlb() is now

static int queue_folios_hugetlb(pte_t *pte, unsigned long hmask,
			       unsigned long addr, unsigned long end,
			       struct mm_walk *walk)
{
	int ret = 0;
#ifdef CONFIG_HUGETLB_PAGE
	struct queue_pages *qp = walk->private;
	unsigned long flags = (qp->flags & MPOL_MF_VALID);
	struct folio *folio;
	spinlock_t *ptl;
	pte_t entry;

	ptl = huge_pte_lock(hstate_vma(walk->vma), walk->mm, pte);
	entry = huge_ptep_get(pte);
	if (!pte_present(entry))
		goto unlock;
	folio = pfn_folio(pte_pfn(entry));
	if (!queue_folio_required(folio, qp))
		goto unlock;

	if (flags == MPOL_MF_STRICT) {
		/*
		 * STRICT alone means only detecting misplaced folio and no
		 * need to further check other vma.
		 */
		ret = -EIO;
		goto unlock;
	}

	if (!vma_migratable(walk->vma)) {
		/*
		 * Must be STRICT with MOVE*, otherwise .test_walk() have
		 * stopped walking current vma.
		 * Detecting misplaced folio but allow migrating folios which
		 * have been queued.
		 */
		ret = 1;
		goto unlock;
	}

	/*
	 * With MPOL_MF_MOVE, we try to migrate only unshared folios. If it
	 * is shared it is likely not worth migrating.
	 *
	 * To check if the folio is shared, ideally we want to make sure
	 * every page is mapped to the same process. Doing that is very
	 * expensive, so check the estimated mapcount of the folio instead.
	 */
	if (flags & (MPOL_MF_MOVE_ALL) ||
	    (flags & MPOL_MF_MOVE && folio_estimated_mapcount(folio) == 1 &&
	     !hugetlb_pmd_shared(pte))) {
		if (isolate_hugetlb(folio, qp->pagelist) &&
			(flags & MPOL_MF_STRICT))
			/*
			 * Failed to isolate folio but allow migrating pages
			 * which have been queued.
			 */
			ret = 1;
	}
unlock:
	spin_unlock(ptl);
#else
	BUG();
#endif
	return ret;
}

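For anyone skimming the thread, the migration test at the end of the function above can be modelled in user space roughly as follows. This is only a sketch of the boolean logic, not kernel code: struct folio_model, should_queue_for_migration() and the MF_* flag values are hypothetical stand-ins invented for illustration, and the two fields simply represent whatever folio_estimated_mapcount() and hugetlb_pmd_shared() would report for the folio and the pte.

```c
#include <stdbool.h>

/* Illustrative placeholders only; these are NOT the kernel's MPOL_MF_* values. */
#define MF_STRICT   (1u << 0)
#define MF_MOVE     (1u << 1)
#define MF_MOVE_ALL (1u << 2)

/* Hypothetical stand-in for the state the kernel function inspects. */
struct folio_model {
	int  estimated_mapcount; /* models folio_estimated_mapcount(folio) */
	bool pmd_shared;         /* models hugetlb_pmd_shared(pte) */
};

/*
 * Mirrors the condition guarding isolate_hugetlb() in the function above:
 * MOVE_ALL always queues the folio; plain MOVE queues it only when the
 * mapcount estimate is 1 AND the PMD page holding the pte is not shared.
 */
static bool should_queue_for_migration(unsigned int flags,
				       const struct folio_model *f)
{
	if (flags & MF_MOVE_ALL)
		return true;

	return (flags & MF_MOVE) &&
	       f->estimated_mapcount == 1 &&
	       !f->pmd_shared;
}
```

In this model, a folio with estimated_mapcount == 1 and pmd_shared == true is skipped under MF_MOVE alone but still queued under MF_MOVE_ALL. That is the case the extra check exists for: with a shared PMD, a mapcount of 1 does not show that only one process maps the page.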
On Thu, Jan 26, 2023 at 02:27:19PM -0800, Mike Kravetz wrote:
> This issue of mapcount in hugetlb pages referenced by shared PMDs was
> discussed in [1]. The following two patches address user visible
> behavior caused by this issue.
>
> Patches apply to mm-stable as they can also target stable backports.
>
> Ongoing folio conversions cause context conflicts in the second patch
> when applied to mm-unstable/linux-next. I can create separate patch(es)
> if people agree with these.
>
> [1] https://lore.kernel.org/linux-mm/Y9BF+OCdWnCSilEu@monkey/
>
> Mike Kravetz (2):
>   mm: hugetlb: proc: check for hugetlb shared PMD in /proc/PID/smaps
>   migrate: hugetlb: Check for hugetlb shared PMD in node migration

Acked-by: Peter Xu <[email protected]>

--
Peter Xu