A recent report [1] from Ryan for arm64 revealed that we do not handle
swap entries when setting a hugepage backed by a NAPOT region (the
riscv equivalent of contpte).

As explained in [1], the issue was discovered by a new test in kselftest
which uses poison entries, but the symptoms on riscv differ from arm64:

- the riscv kernel bugs because we do not handle VM_FAULT_HWPOISON*,
  this is fixed by patch 1,
- after that, the test passes because the first pte_napot() fails (the
  poison entry does not have the N bit set), and then we only set the
  first page table entry covering the NAPOT hugepage, which is enough
  for hugetlb_fault() to correctly raise a VM_FAULT_HWPOISON wherever we
  write in this mapping since only this first page table entry is
  checked
  (see https://elixir.bootlin.com/linux/v6.6-rc3/source/mm/hugetlb.c#L6071).
  But this seems fragile so patch 2 sets all page table entries of a
  NAPOT mapping.

[1]: https://lore.kernel.org/linux-arm-kernel/[email protected]/
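
To illustrate the idea behind patch 2, here is a minimal userspace sketch, not the actual kernel change. All model_* names and MODEL_* constants are hypothetical and exist only for this illustration; the sketch assumes 4 KiB base pages, a present bit at bit 0, the N bit at bit 63, and a PTE-level NAPOT size such as 64 KiB. The point it shows: a non-present (swap or poison) entry carries no N bit, so the number of entries to fill has to come from the size argument rather than from pte_napot().

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace model; none of these names are kernel APIs. */
typedef uint64_t model_pte_t;

#define MODEL_PAGE_SHIFT  12            /* assumption: 4 KiB base pages */
#define MODEL_PTE_PRESENT (1ULL << 0)   /* models the valid bit */
#define MODEL_PTE_NAPOT   (1ULL << 63)  /* models the N bit */

static int model_pte_present(model_pte_t pte)
{
	return (pte & MODEL_PTE_PRESENT) != 0;
}

static int model_pte_napot(model_pte_t pte)
{
	return (pte & MODEL_PTE_NAPOT) != 0;
}

/*
 * Write a huge PTE covering sz bytes, starting at ptep[0].
 * Only PTE-level sizes are modelled here.
 * Returns the number of page table entries written.
 */
static size_t model_set_huge_pte_at(model_pte_t *ptep, model_pte_t pte,
				    unsigned long sz)
{
	size_t i, n = (size_t)(sz >> MODEL_PAGE_SHIFT);

	if (!model_pte_present(pte)) {
		/*
		 * Swap/poison entry: the N bit is not set, so checking
		 * it would wrongly fill only the first slot. Use sz.
		 */
		for (i = 0; i < n; i++)
			ptep[i] = pte;
		return n;
	}

	if (!model_pte_napot(pte)) {
		/* Present, regular mapping: a single entry. */
		ptep[0] = pte;
		return 1;
	}

	/* Present NAPOT mapping: every covered entry gets the same value. */
	for (i = 0; i < n; i++)
		ptep[i] = pte;
	return n;
}
```

With sz = 64 KiB, a non-present entry passed to model_set_huge_pte_at() ends up in all 16 slots, so a later lookup of any slot in the range sees the poison entry rather than relying on the first slot alone.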
Alexandre Ghiti (2):
riscv: Handle VM_FAULT_[HWPOISON|HWPOISON_LARGE] faults instead of
panicking
riscv: Fix set_huge_pte_at() for NAPOT mappings when a swap entry is
set
arch/riscv/mm/fault.c | 2 +-
arch/riscv/mm/hugetlbpage.c | 19 +++++++++++++------
2 files changed, 14 insertions(+), 7 deletions(-)
--
2.39.2
+cc Andrew: Would you mind taking this patchset in your tree for the
next rc? This patchset depends on a previous fix for arm64 that you
merged in rc4 which is not in the riscv -fixes branch yet.
I saw with Palmer and he should ack this shortly.
If I can do anything else to help, let me know.
Thanks,
Alex
On 28/09/2023 17:18, Alexandre Ghiti wrote:
> A recent report [1] from Ryan for arm64 revealed that we do not handle
> swap entries when setting a hugepage backed by a NAPOT region (the
> contpte riscv equivalent).
>
> As explained in [1], the issue was discovered by a new test in kselftest
> which uses poison entries, but the symptoms on riscv differ from arm64:
>
> - the riscv kernel bugs because we do not handle VM_FAULT_HWPOISON*,
> this is fixed by patch 1,
> - after that, the test passes because the first pte_napot() fails (the
> poison entry does not have the N bit set), and then we only set the
> first page table entry covering the NAPOT hugepage, which is enough
> for hugetlb_fault() to correctly raise a VM_FAULT_HWPOISON wherever we
> write in this mapping since only this first page table entry is
> checked
> (see https://elixir.bootlin.com/linux/v6.6-rc3/source/mm/hugetlb.c#L6071).
> But this seems fragile so patch 2 sets all page table entries of a
> NAPOT mapping.
>
> [1]: https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
>
> Alexandre Ghiti (2):
> riscv: Handle VM_FAULT_[HWPOISON|HWPOISON_LARGE] faults instead of
> panicking
> riscv: Fix set_huge_pte_at() for NAPOT mappings when a swap entry is
> set
>
> arch/riscv/mm/fault.c | 2 +-
> arch/riscv/mm/hugetlbpage.c | 19 +++++++++++++------
> 2 files changed, 14 insertions(+), 7 deletions(-)
>
On Tue, 3 Oct 2023 17:43:10 +0200 Alexandre Ghiti <[email protected]> wrote:
> +cc Andrew: Would you mind taking this patchset in your tree for the
> next rc? This patchset depends on a previous fix for arm64 that you
> merged in rc4 which is not in the riscv -fixes branch yet.
>
> I saw with Palmer and he should ack this shortly.
Well I grabbed them into mm.git's mm-hotfixes-unstable queue. All
being well I'll move them into mm-hotfixes-stable within a week then
into Linus shortly after.
Unless something changes. It's odd that the riscv tree(s) aren't set
up to merge fixes against -rc4?
On Tue, 03 Oct 2023 09:04:43 PDT (-0700), [email protected] wrote:
> On Tue, 3 Oct 2023 17:43:10 +0200 Alexandre Ghiti <[email protected]> wrote:
>
>> +cc Andrew: Would you mind taking this patchset in your tree for the
>> next rc? This patchset depends on a previous fix for arm64 that you
>> merged in rc4 which is not in the riscv -fixes branch yet.
>>
>> I saw with Palmer and he should ack this shortly.
>
> Well I grabbed them into mm.git's mm-hotfixes-unstable queue. All
> being well I'll move them into mm-hotfixes-stable within a week then
> into Linus shortly after.
>
> Unless something changes. It's odd that the riscv tree(s) aren't set
> up to merge fixes against -rc4?
It's mostly that I have COVID, so everything's kind of a mess right now.
Acked-by: Palmer Dabbelt <[email protected]>
Thanks!
Hi Andrew,
On Tue, Oct 3, 2023 at 6:04 PM Andrew Morton <[email protected]> wrote:
>
> On Tue, 3 Oct 2023 17:43:10 +0200 Alexandre Ghiti <[email protected]> wrote:
>
> > +cc Andrew: Would you mind taking this patchset in your tree for the
> > next rc? This patchset depends on a previous fix for arm64 that you
> > merged in rc4 which is not in the riscv -fixes branch yet.
> >
> > I saw with Palmer and he should ack this shortly.
>
> Well I grabbed them into mm.git's mm-hotfixes-unstable queue. All
> being well I'll move them into mm-hotfixes-stable within a week then
> into Linus shortly after.
Those fixes finally did not make it to 6.6, I was hoping for them to
land in -rc6 or -rc7: for the future, what should I have done to avoid
that?
Thanks,
Alex
>
> Unless something changes. It's odd that the riscv tree(s) aren't set
> up to merge fixes against -rc4?
On Thu, 26 Oct 2023 10:57:27 +0200 Alexandre Ghiti <[email protected]> wrote:
> On Tue, Oct 3, 2023 at 6:04 PM Andrew Morton <[email protected]> wrote:
> >
> > On Tue, 3 Oct 2023 17:43:10 +0200 Alexandre Ghiti <[email protected]> wrote:
> >
> > > +cc Andrew: Would you mind taking this patchset in your tree for the
> > > next rc? This patchset depends on a previous fix for arm64 that you
> > > merged in rc4 which is not in the riscv -fixes branch yet.
> > >
> > > I saw with Palmer and he should ack this shortly.
> >
> > Well I grabbed them into mm.git's mm-hotfixes-unstable queue. All
> > being well I'll move them into mm-hotfixes-stable within a week then
> > into Linus shortly after.
>
> Those fixes finally did not make it to 6.6, I was hoping for them to
> land in -rc6 or -rc7: for the future, what should I have done to avoid
> that?
They're merged in Linus's tree.
6f1bace9a9fb arm64: hugetlb: fix set_huge_pte_at() to work with all swap entries
935d4f0c6dc8 mm: hugetlb: add huge page size param to set_huge_pte_at()
On 26/10/2023 16:13, Andrew Morton wrote:
> On Thu, 26 Oct 2023 10:57:27 +0200 Alexandre Ghiti <[email protected]> wrote:
>
>> On Tue, Oct 3, 2023 at 6:04 PM Andrew Morton <[email protected]> wrote:
>>> On Tue, 3 Oct 2023 17:43:10 +0200 Alexandre Ghiti <[email protected]> wrote:
>>>
>>>> +cc Andrew: Would you mind taking this patchset in your tree for the
>>>> next rc? This patchset depends on a previous fix for arm64 that you
>>>> merged in rc4 which is not in the riscv -fixes branch yet.
>>>>
>>>> I saw with Palmer and he should ack this shortly.
>>> Well I grabbed them into mm.git's mm-hotfixes-unstable queue. All
>>> being well I'll move them into mm-hotfixes-stable within a week then
>>> into Linus shortly after.
>> Those fixes finally did not make it to 6.6, I was hoping for them to
>> land in -rc6 or -rc7: for the future, what should I have done to avoid
>> that?
> They're merged in Linus's tree.
>
> 6f1bace9a9fb arm64: hugetlb: fix set_huge_pte_at() to work with all swap entries
> 935d4f0c6dc8 mm: hugetlb: add huge page size param to set_huge_pte_at()
Oops, sorry, I missed them this morning!
Thanks,
Alex
On 26/10/2023 15:15, Alexandre Ghiti wrote:
> On 26/10/2023 16:13, Andrew Morton wrote:
>> On Thu, 26 Oct 2023 10:57:27 +0200 Alexandre Ghiti <[email protected]>
>> wrote:
>>
>>> On Tue, Oct 3, 2023 at 6:04 PM Andrew Morton <[email protected]> wrote:
>>>> On Tue, 3 Oct 2023 17:43:10 +0200 Alexandre Ghiti <[email protected]> wrote:
>>>>
>>>>> +cc Andrew: Would you mind taking this patchset in your tree for the
>>>>> next rc? This patchset depends on a previous fix for arm64 that you
>>>>> merged in rc4 which is not in the riscv -fixes branch yet.
>>>>>
>>>>> I saw with Palmer and he should ack this shortly.
>>>> Well I grabbed them into mm.git's mm-hotfixes-unstable queue. All
>>>> being well I'll move them into mm-hotfixes-stable within a week then
>>>> into Linus shortly after.
>>> Those fixes finally did not make it to 6.6, I was hoping for them to
>>> land in -rc6 or -rc7: for the future, what should I have done to avoid
>>> that?
>> They're merged in Linus's tree.
>>
>> 6f1bace9a9fb arm64: hugetlb: fix set_huge_pte_at() to work with all swap entries
>> 935d4f0c6dc8 mm: hugetlb: add huge page size param to set_huge_pte_at()
>
>
> Oops, sorry, I missed them this morning!
Those two patches that Andrew highlights are the fix I did for arm64. Weren't
you referring to the corresponding patches you did for riscv, Alex?
>
> Thanks,
>
> Alex
>
On Thu, 26 Oct 2023 15:30:44 +0100 Ryan Roberts <[email protected]> wrote:
> >>> Those fixes finally did not make it to 6.6, I was hoping for them to
> >>> land in -rc6 or -rc7: for the future, what should I have done to avoid
> >>> that?
> >> They're merged in Linus's tree.
> >>
> >> 6f1bace9a9fb arm64: hugetlb: fix set_huge_pte_at() to work with all swap entries
> >> 935d4f0c6dc8 mm: hugetlb: add huge page size param to set_huge_pte_at()
> >
> >
> > Oops, sorry, I missed them this morning!
>
> Those two patches that Andrew highlights are the fix I did for arm64. Weren't
> you referring to the corresponding patches you did for riscv, Alex?
These are in mainline:
1de195dd0e05 riscv: fix set_huge_pte_at() for NAPOT mappings when a swap entry is set
117b1bb0cbc7 riscv: handle VM_FAULT_[HWPOISON|HWPOISON_LARGE] faults instead of panicking
I'm not sure what happened to your "riscv: hugetlb: convert
set_huge_pte_at() to take vma" - perhaps it was updated.
On 26/10/2023 15:54, Andrew Morton wrote:
> On Thu, 26 Oct 2023 15:30:44 +0100 Ryan Roberts <[email protected]> wrote:
>
>>>>> Those fixes finally did not make it to 6.6, I was hoping for them to
>>>>> land in -rc6 or -rc7: for the future, what should I have done to avoid
>>>>> that?
>>>> They're merged in Linus's tree.
>>>>
>>>> 6f1bace9a9fb arm64: hugetlb: fix set_huge_pte_at() to work with all swap entries
>>>> 935d4f0c6dc8 mm: hugetlb: add huge page size param to set_huge_pte_at()
>>>
>>>
>>> Oops, sorry, I missed them this morning!
>>
>> Those two patches that Andrew highlights are the fix I did for arm64. Weren't
>> you referring to the corresponding patches you did for riscv, Alex?
>
> These are in mainline:
>
> 1de195dd0e05 riscv: fix set_huge_pte_at() for NAPOT mappings when a swap entry is set
> 117b1bb0cbc7 riscv: handle VM_FAULT_[HWPOISON|HWPOISON_LARGE] faults instead of panicking
Ahh, great - I think they were probably the ones Alex was talking about.
>
> I'm not sure what happened to your "riscv: hugetlb: convert
> set_huge_pte_at() to take vma" - perhaps it was updated.
I modified the approach for v2 (pass size param instead of vma) and it got
squashed into 935d4f0c6dc8 mm: hugetlb: add huge page size param to
set_huge_pte_at(), which is in.