We used to determine the number of page table entries to set for a NAPOT
hugepage from the pte value, which fails when the pte to set is a swap
entry.

So take advantage of a recent fix for arm64, reported in [1], which
introduces the size of the mapping as an argument of set_huge_pte_at(): we
can then use this size to compute the number of page table entries to set
for a NAPOT region.

Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page")
Reported-by: Ryan Roberts <[email protected]>
Closes: https://lore.kernel.org/linux-arm-kernel/[email protected]/ [1]
Signed-off-by: Alexandre Ghiti <[email protected]>
---
arch/riscv/mm/hugetlbpage.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
index e4a2ace92dbe..b52f0210481f 100644
--- a/arch/riscv/mm/hugetlbpage.c
+++ b/arch/riscv/mm/hugetlbpage.c
@@ -183,15 +183,22 @@ void set_huge_pte_at(struct mm_struct *mm,
 		     pte_t pte,
 		     unsigned long sz)
 {
+	unsigned long hugepage_shift;
 	int i, pte_num;
 
-	if (!pte_napot(pte)) {
-		set_pte_at(mm, addr, ptep, pte);
-		return;
-	}
+	if (sz >= PGDIR_SIZE)
+		hugepage_shift = PGDIR_SHIFT;
+	else if (sz >= P4D_SIZE)
+		hugepage_shift = P4D_SHIFT;
+	else if (sz >= PUD_SIZE)
+		hugepage_shift = PUD_SHIFT;
+	else if (sz >= PMD_SIZE)
+		hugepage_shift = PMD_SHIFT;
+	else
+		hugepage_shift = PAGE_SHIFT;
 
-	pte_num = napot_pte_num(napot_cont_order(pte));
-	for (i = 0; i < pte_num; i++, ptep++, addr += PAGE_SIZE)
+	pte_num = sz >> hugepage_shift;
+	for (i = 0; i < pte_num; i++, ptep++, addr += (1 << hugepage_shift))
 		set_pte_at(mm, addr, ptep, pte);
 }

--
2.39.2
On Thu, Sep 28, 2023 at 05:18:46PM +0200, Alexandre Ghiti wrote:
> We used to determine the number of page table entries to set for a NAPOT
> hugepage by using the pte value which actually fails when the pte to set is
> a swap entry.
>
> So take advantage of a recent fix for arm64 reported in [1] which
> introduces the size of the mapping as an argument of set_huge_pte_at(): we
> can then use this size to compute the number of page table entries to set
> for a NAPOT region.
>
> Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page")
> Reported-by: Ryan Roberts <[email protected]>
> Closes: https://lore.kernel.org/linux-arm-kernel/[email protected]/ [1]
> Signed-off-by: Alexandre Ghiti <[email protected]>
Breaks the build. Your $subject marks this for -fixes, but this will not
build there, as it relies on content that's not yet in that branch.
AFAICT, you're going to have to resend this with akpm on CC, as the
dependency is in his tree...
Thanks,
Conor.
> _______________________________________________
> linux-riscv mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-riscv
Hi Conor,
On 30/09/2023 11:14, Conor Dooley wrote:
> On Thu, Sep 28, 2023 at 05:18:46PM +0200, Alexandre Ghiti wrote:
>> We used to determine the number of page table entries to set for a NAPOT
>> hugepage by using the pte value which actually fails when the pte to set is
>> a swap entry.
>>
>> So take advantage of a recent fix for arm64 reported in [1] which
>> introduces the size of the mapping as an argument of set_huge_pte_at(): we
>> can then use this size to compute the number of page table entries to set
>> for a NAPOT region.
>>
>> Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page")
>> Reported-by: Ryan Roberts <[email protected]>
>> Closes: https://lore.kernel.org/linux-arm-kernel/[email protected]/ [1]
>> Signed-off-by: Alexandre Ghiti <[email protected]>
> Breaks the build. Your $subject marks this for -fixes, but this will not
> build there, as it relies on content that's not yet in that branch.
> AFAICT, you're going to have to resend this with akpm on CC, as the
> dependency is in his tree...
I see, but I still don't understand why -fixes does not point to the
latest rcX instead of staying on rc1? The patch which this series
depends on just made it to rc4.
Thanks,
Alex
On Mon, Oct 02, 2023 at 09:18:52AM +0200, Alexandre Ghiti wrote:
> Hi Conor,
>
> On 30/09/2023 11:14, Conor Dooley wrote:
> > On Thu, Sep 28, 2023 at 05:18:46PM +0200, Alexandre Ghiti wrote:
> > > We used to determine the number of page table entries to set for a NAPOT
> > > hugepage by using the pte value which actually fails when the pte to set is
> > > a swap entry.
> > >
> > > So take advantage of a recent fix for arm64 reported in [1] which
> > > introduces the size of the mapping as an argument of set_huge_pte_at(): we
> > > can then use this size to compute the number of page table entries to set
> > > for a NAPOT region.
> > >
> > > Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page")
> > > Reported-by: Ryan Roberts <[email protected]>
> > > Closes: https://lore.kernel.org/linux-arm-kernel/[email protected]/ [1]
> > > Signed-off-by: Alexandre Ghiti <[email protected]>
> > Breaks the build. Your $subject marks this for -fixes, but this will not
> > build there, as it relies on content that's not yet in that branch.
> > AFAICT, you're going to have to resend this with akpm on CC, as the
> > dependency is in his tree...
>
>
> I see, but I still don't understand why -fixes does not point to the latest
> rcX instead of staying on rc1?
It's up to Palmer what he does with his fixes branch, but two thoughts.
Doing what you suggest would require rebasing things not yet sent to Linus
every week and fast-forwarding when PRs are actually merged.
IIRC, Palmer used to do something like the latter, but he got some
complaints about that and switched to the current method.
At the very least, you should point out dependencies like this, as I
figure an individual patch could be applied on top of -rc4 and merged
in. Both Palmer and I have submitted things for b4 to improve support for
doing things exactly like this ;)
> The patch which this series depends on just made it to rc4.
However, if you do not mention what the deps for your patches are
explicitly, how are people supposed to know? The reference to the
dependency makes it look like a report for a similar problem that also
applies to riscv, not a pre-requisite for the patch.
Thanks,
Conor.
On Thu, Sep 28, 2023 at 05:18:46PM +0200, Alexandre Ghiti wrote:
> We used to determine the number of page table entries to set for a NAPOT
> hugepage by using the pte value which actually fails when the pte to set is
> a swap entry.
>
> So take advantage of a recent fix for arm64 reported in [1] which
> introduces the size of the mapping as an argument of set_huge_pte_at(): we
> can then use this size to compute the number of page table entries to set
> for a NAPOT region.
>
> Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page")
> Reported-by: Ryan Roberts <[email protected]>
> Closes: https://lore.kernel.org/linux-arm-kernel/[email protected]/ [1]
> Signed-off-by: Alexandre Ghiti <[email protected]>
> ---
> arch/riscv/mm/hugetlbpage.c | 19 +++++++++++++------
> 1 file changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c
> index e4a2ace92dbe..b52f0210481f 100644
> --- a/arch/riscv/mm/hugetlbpage.c
> +++ b/arch/riscv/mm/hugetlbpage.c
> @@ -183,15 +183,22 @@ void set_huge_pte_at(struct mm_struct *mm,
>  		     pte_t pte,
>  		     unsigned long sz)
>  {
> +	unsigned long hugepage_shift;
>  	int i, pte_num;
>
> -	if (!pte_napot(pte)) {
> -		set_pte_at(mm, addr, ptep, pte);
> -		return;
> -	}
> +	if (sz >= PGDIR_SIZE)
> +		hugepage_shift = PGDIR_SHIFT;
> +	else if (sz >= P4D_SIZE)
> +		hugepage_shift = P4D_SHIFT;
> +	else if (sz >= PUD_SIZE)
> +		hugepage_shift = PUD_SHIFT;
> +	else if (sz >= PMD_SIZE)
> +		hugepage_shift = PMD_SHIFT;
> +	else
> +		hugepage_shift = PAGE_SHIFT;
>
> -	pte_num = napot_pte_num(napot_cont_order(pte));
> -	for (i = 0; i < pte_num; i++, ptep++, addr += PAGE_SIZE)
> +	pte_num = sz >> hugepage_shift;
> +	for (i = 0; i < pte_num; i++, ptep++, addr += (1 << hugepage_shift))
>  		set_pte_at(mm, addr, ptep, pte);
>  }
>
So a 64k napot, for example, will fall into the PAGE_SHIFT arm, but then
we'll calculate 16 for pte_num. Looks good to me.
Reviewed-by: Andrew Jones <[email protected]>
Thanks,
drew
Hey Conor,
On 02/10/2023 15:11, Conor Dooley wrote:
> On Mon, Oct 02, 2023 at 09:18:52AM +0200, Alexandre Ghiti wrote:
>> Hi Conor,
>>
>> On 30/09/2023 11:14, Conor Dooley wrote:
>>> On Thu, Sep 28, 2023 at 05:18:46PM +0200, Alexandre Ghiti wrote:
>>>> We used to determine the number of page table entries to set for a NAPOT
>>>> hugepage by using the pte value which actually fails when the pte to set is
>>>> a swap entry.
>>>>
>>>> So take advantage of a recent fix for arm64 reported in [1] which
>>>> introduces the size of the mapping as an argument of set_huge_pte_at(): we
>>>> can then use this size to compute the number of page table entries to set
>>>> for a NAPOT region.
>>>>
>>>> Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page")
>>>> Reported-by: Ryan Roberts <[email protected]>
>>>> Closes: https://lore.kernel.org/linux-arm-kernel/[email protected]/ [1]
>>>> Signed-off-by: Alexandre Ghiti <[email protected]>
>>> Breaks the build. Your $subject marks this for -fixes, but this will not
>>> build there, as it relies on content that's not yet in that branch.
>>> AFAICT, you're going to have to resend this with akpm on CC, as the
>>> dependency is in his tree...
>>
>> I see, but I still don't understand why -fixes does not point to the latest
>> rcX instead of staying on rc1?
> It's up to Palmer what he does with his fixes branch, but two thoughts.
> Doing what you suggest would require rebasing things not yet sent to Linus
> every week and fast-forwarding when PRs are actually merged.
> IIRC, Palmer used to do something like the latter, but IIRC he got some
> complaints about that and switched to the current method.
> At the very least, you should point out dependencies like this, as I
> figure an individual patch could be applied on top of -rc4 and merged
> in. Both Palmer and I have submitted things for b4 to improve support for
> doing things exactly like this ;)
>
>> The patch which this series depends on just made it to rc4.
> However, if you do not mention what the deps for your patches are
> explicitly, how are people supposed to know? The reference to the
> dependency makes it look like a report for a similar problem that also
> applies to riscv, not a pre-requisite for the patch.
You're right, I saw the dependency being merged so I thought it would be
ok, but I should have mentioned it. I have just discussed this with Palmer,
and I'll +cc Andrew to see if he can take it through his tree.
Thanks!
Alex