2024-06-13 07:39:42

by Oscar Salvador

Subject: Re: [PATCH v5 16/18] powerpc/64s: Use contiguous PMD/PUD instead of HUGEPD

On Mon, Jun 10, 2024 at 07:55:01AM +0200, Christophe Leroy wrote:
> On book3s/64, the only user of hugepd is hash in 4k mode.
>
> All other setups (hash-64, radix-4, radix-64) use leaf PMD/PUD.
>
> Rework hash-4k to use contiguous PMD and PUD instead.
>
> In that setup there are only two huge page sizes: 16M and 16G.
>
> 16M sits at PMD level and 16G at PUD level.
>
> pte_update doesn't know page size, lets use the same trick as
> hpte_need_flush() to get page size from segment properties. That's
> not the most efficient way but let's do that until callers of
> pte_update() provide page size instead of just a huge flag.
>
> Signed-off-by: Christophe Leroy <[email protected]>
> ---
...
> +static inline unsigned long hash__pte_update(struct mm_struct *mm,
> + unsigned long addr,
> + pte_t *ptep, unsigned long clr,
> + unsigned long set,
> + int huge)
> +{
> + unsigned long old;
> +
> + old = hash__pte_update_one(ptep, clr, set);
> +
> + if (IS_ENABLED(CONFIG_PPC_4K_PAGES) && huge) {
> + unsigned int psize = get_slice_psize(mm, addr);
> + int nb, i;
> +
> + if (psize == MMU_PAGE_16M)
> + nb = SZ_16M / PMD_SIZE;
> + else if (psize == MMU_PAGE_16G)
> + nb = SZ_16G / PUD_SIZE;
> + else
> + nb = 1;

Although nb = 1 might be a safe default, could it carry consequences down the road?
Maybe not, but if we ever reach that branch something went wrong, so I would at
least put a WARN_ON_ONCE() there.
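
Something along these lines is what I have in mind. This is only a userspace
sketch: warn_on_once() is a stand-in for the kernel's WARN_ON_ONCE(),
nb_entries() is a hypothetical helper mirroring the if/else chain above, and
the MMU_PAGE_* values and the PMD_SIZE/PUD_SIZE values (2M and 256M) are
assumptions for hash-4k, not taken from the patch.

```c
#include <stdio.h>

/* Stand-in values; the real MMU_PAGE_* constants live in the kernel headers. */
enum { MMU_PAGE_4K, MMU_PAGE_16M, MMU_PAGE_16G };

#define SZ_16M   (16ULL << 20)
#define SZ_16G   (16ULL << 30)
#define PMD_SIZE (2ULL << 20)    /* assumption: a hash-4k PMD entry maps 2M */
#define PUD_SIZE (256ULL << 20)  /* assumption: a hash-4k PUD entry maps 256M */

static int warned_once;

/* Userspace stand-in for WARN_ON_ONCE(): prints on the first hit only. */
static int warn_on_once(int cond, const char *what)
{
	if (cond && !warned_once) {
		warned_once = 1;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Number of contiguous PMD/PUD entries backing one huge page. */
static int nb_entries(unsigned int psize)
{
	if (psize == MMU_PAGE_16M)
		return (int)(SZ_16M / PMD_SIZE);
	if (psize == MMU_PAGE_16G)
		return (int)(SZ_16G / PUD_SIZE);

	/* Unexpected size for a huge mapping: complain once, keep the safe default. */
	warn_on_once(1, "unexpected psize for a huge mapping");
	return 1;
}
```

In hash__pte_update() itself this would presumably reduce to a WARN_ON_ONCE(1)
in the final else branch, keeping nb = 1 as the fallback.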

> --- a/arch/powerpc/mm/book3s64/hugetlbpage.c
> +++ b/arch/powerpc/mm/book3s64/hugetlbpage.c
> @@ -53,6 +53,16 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
> /* If PTE permissions don't match, take page fault */
> if (unlikely(!check_pte_access(access, old_pte)))
> return 1;
> + /*
> + * If hash-4k, hugepages use seeral contiguous PxD entries
> + * so bail out and let mm make the page young or dirty
> + */
> + if (IS_ENABLED(CONFIG_PPC_4K_PAGES)) {
> + if (!(old_pte & _PAGE_ACCESSED))
> + return 1;
> + if ((access & _PAGE_WRITE) && !(old_pte & _PAGE_DIRTY))
> + return 1;
> + }

You mentioned that we need to bail out because otherwise only the first PxD
would be updated.
In the comment you say that mm will take care of making the page young
or dirty.
Does this mean that the remaining PxDs of the contiguous range will not have
their bits updated?
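
To make sure I read the new check correctly, here is how I understand it, as a
standalone userspace sketch. The flag values and the names (PTE_ACCESSED,
PTE_DIRTY, ACCESS_WRITE, hash4k_huge_must_fault()) are hypothetical stand-ins
for _PAGE_ACCESSED, _PAGE_DIRTY, _PAGE_WRITE and the hunk above, not the real
kernel definitions.

```c
/* Hypothetical bit values, for illustration only. */
#define PTE_ACCESSED 0x1UL
#define PTE_DIRTY    0x2UL
#define ACCESS_WRITE 0x4UL

/*
 * Returns 1 when the hash fault path should give up and let the generic
 * fault path set the young/dirty bits, 0 when it can carry on.
 */
static int hash4k_huge_must_fault(unsigned long old_pte, unsigned long access)
{
	/* Page not young yet: the hash path would only touch the first entry. */
	if (!(old_pte & PTE_ACCESSED))
		return 1;

	/* Write access to a clean page: same problem for the dirty bit. */
	if ((access & ACCESS_WRITE) && !(old_pte & PTE_DIRTY))
		return 1;

	return 0;
}
```

So a read of an already-young page, or a write to a page that is already young
and dirty, continues in __hash_page_huge(); everything else returns 1.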


--
Oscar Salvador
SUSE Labs