2024-01-18 08:35:46

by Nylon Chen

Subject: Re: Fwd: [PATCH v8 0/4] riscv: Use PUD/P4D/PGD pages for the linear mapping

> On 3/23/23 15:55, Anup Patel wrote:
> > On Thu, Mar 23, 2023 at 6:24 PM Alexandre Ghiti <[email protected]> wrote:
> >> Hi Anup,
> >>
> >> On Thu, Mar 23, 2023 at 1:18 PM Anup Patel <[email protected]> wrote:
> >>> Hi Alex,
> >>>
> >>> On Thu, Mar 16, 2023 at 6:48 PM Alexandre Ghiti <[email protected]> wrote:
> >>>> This patchset intends to improve tlb utilization by using hugepages for
> >>>> the linear mapping.
> >>>>
> >>>> As reported by Anup in v6, when STRICT_KERNEL_RWX is enabled, we must
> >>>> take care of isolating the kernel text and rodata so that they are not
> >>>> mapped with a PUD mapping which would then assign wrong permissions to
> >>>> the whole region: it is achieved by introducing a new memblock API.
> >>>>
> >>>> Another patch makes use of this new API in arm64 which used some sort of
> >>>> hack to solve this issue: it was built/boot tested successfully.
> >>>>
> >>>> base-commit-tag: v6.3-rc1
> >>>>
> >>>> v8:
> >>>> - Fix rv32, as reported by Anup
> >>>> - Do not modify memblock_isolate_range and fixes comment, as suggested by Mike
> >>>> - Use the new memblock API for crash kernel too in arm64, as suggested by Andrew
> >>>> - Fix arm64 double mapping (which to me did not work in v7), but ends up not
> >>>> being pretty at all, will wait for comments from arm64 reviewers, but
> >>>> this patch can easily be dropped if they do not want it.
> >>>>
> >>>> v7:
> >>>> - Fix Anup bug report by introducing memblock_isolate_memory which
> >>>> allows us to split the memblock mappings and then avoid mapping
> >>>> the PUD which contains the kernel as read only
> >>>> - Add a patch to arm64 to use this newly introduced API
> >>>>
> >>>> v6:
> >>>> - quiet LLVM warning by casting phys_ram_base into an unsigned long
> >>>>
> >>>> v5:
> >>>> - Fix nommu builds by getting rid of riscv_pfn_base in patch 1, thanks
> >>>> Conor
> >>>> - Add RB from Andrew
> >>>>
> >>>> v4:
> >>>> - Rebase on top of v6.2-rc3, as noted by Conor
> >>>> - Add Acked-by Rob
> >>>>
> >>>> v3:
> >>>> - Change the comment about initrd_start VA conversion so that it fits
> >>>> ARM64 and RISCV64 (and others in the future if needed), as suggested
> >>>> by Rob
> >>>>
> >>>> v2:
> >>>> - Add a comment on why RISCV64 does not need to set initrd_start/end that
> >>>> early in the boot process, as asked by Rob
> >>>>
> >>>> Alexandre Ghiti (4):
> >>>> riscv: Get rid of riscv_pfn_base variable
> >>>> mm: Introduce memblock_isolate_memory
> >>>> arm64: Make use of memblock_isolate_memory for the linear mapping
> >>>> riscv: Use PUD/P4D/PGD pages for the linear mapping
> >>> The kernel boots fine on RV64, but there is a failure which is still
> >>> not addressed. You can see this failure as the following message in
> >>> the kernel boot log:
> >>> [    0.000000] Failed to add a System RAM resource at 80200000
> >> Hmmm I don't get that in any of my test configs, would you mind
> >> sharing yours and your qemu command line?
> > Try alexghiti_test branch at
> > https://github.com/avpatel/linux.git
> >
> > I am building the kernel using defconfig and my rootfs is
> > based on busybox.
> >
> > My QEMU command is:
> > qemu-system-riscv64 -M virt -m 512M -nographic -bios
> > opensbi/build/platform/generic/firmware/fw_dynamic.bin -kernel
> > ./build-riscv64/arch/riscv/boot/Image -append "root=/dev/ram rw
> > console=ttyS0 earlycon" -initrd ./rootfs_riscv64.img -smp 4
>
>
> So splitting memblock.memory is the culprit, it "confuses" the resources
> addition and I can only find hacky ways to fix that...
Hi Alexandre,

We encountered the same error as Anup. After applying your patch
(3335068f87217ea59d08f462187dc856652eea15), the error no longer
occurs.

What I have observed so far:

- Before your patch:
  When consecutive memblocks are merged, if the memblock types are
  different, they will be merged into reserved.
- After your patch:
  When consecutive memblocks are merged, if the memblock types are
  different, they will be merged into memory.

As a result, the memory region occupied by OpenSBI changes from
reserved to memory. Will this have any side effects?
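
For reference, the merge rule under discussion can be modelled in a few lines. This is only a simplified sketch, not kernel code: it assumes that regions inside one memblock list are merged when they are physically contiguous and carry identical flags, and it ignores NUMA node IDs. The Region class, the NOMAP value and the kernel size used below are illustrative assumptions. In this model "memory" and "reserved" are two separate lists, and merging only ever happens inside one list.

```python
from dataclasses import dataclass

NOMAP = 0x4  # illustrative flag bit standing in for a "no-map" marker


@dataclass
class Region:
    base: int
    size: int
    flags: int = 0

    @property
    def end(self) -> int:
        # First address past the region (exclusive end).
        return self.base + self.size


def merge_regions(regions):
    """Merge neighbours that are contiguous and have identical flags."""
    merged = []
    for r in sorted(regions, key=lambda reg: reg.base):
        last = merged[-1] if merged else None
        if last and last.end == r.base and last.flags == r.flags:
            merged[-1] = Region(last.base, last.size + r.size, r.flags)
        else:
            merged.append(Region(r.base, r.size, r.flags))
    return merged


# 512M of RAM at 0x80000000, with a hypothetical 16M kernel range at
# 0x80200000 marked NOMAP so that it stays isolated from its neighbours.
memory = [
    Region(0x80000000, 0x00200000),
    Region(0x80200000, 0x01000000, NOMAP),
    Region(0x81200000, 0x1EE00000),
]

still_split = merge_regions(memory)  # flags differ: three regions remain

# Clearing the flag makes the three regions mergeable again.
cleared = [Region(r.base, r.size, 0) for r in memory]
remerged = merge_regions(cleared)    # one region covering all 512M
```

If this model is close to the real behaviour, clearing the flag only re-joins entries of the memory list; a reserved entry such as the OpenSBI range is tracked in the other list and is not touched by that merge.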
>
> So given that the arm64 patch with the new API is not pretty and that
> the simplest solution is to re-merge the memblock regions afterwards
> (which is done by memblock_clear_nomap), I'll drop the new API and the
> arm64 patch to use the nomap API like arm64: I'll take advantage of that
> to clean setup_vm_final which I have wanted to do for a long time.
>
> @Mike Thanks for your reviews!
>
> @Anup Thanks for all your bug reports on this patchset, I have to
> improve my test flow (it is in the works :)).
>
>
> > Regards,
> > Anup
> >
> >> Thanks
> >>
> >>> Regards,
> >>> Anup
> >>>
> >>>> arch/arm64/mm/mmu.c | 25 +++++++++++------
> >>>> arch/riscv/include/asm/page.h | 19 +++++++++++--
> >>>> arch/riscv/mm/init.c | 53 ++++++++++++++++++++++++++++-------
> >>>> arch/riscv/mm/physaddr.c | 16 +++++++++++
> >>>> drivers/of/fdt.c | 11 ++++----
> >>>> include/linux/memblock.h | 1 +
> >>>> mm/memblock.c | 20 +++++++++++++
> >>>> 7 files changed, 119 insertions(+), 26 deletions(-)
> >>>>
> >>>> --
> >>>> 2.37.2
> >>>>
> > _______________________________________________
> > linux-riscv mailing list
> > [email protected]
> > http://lists.infradead.org/mailman/listinfo/linux-riscv


2024-01-18 13:02:34

by Alexandre Ghiti

Subject: Re: Fwd: [PATCH v8 0/4] riscv: Use PUD/P4D/PGD pages for the linear mapping

Hi Nylon,

On Thu, Jan 18, 2024 at 9:23 AM Nylon Chen <[email protected]> wrote:
>
> > On 3/23/23 15:55, Anup Patel wrote:
> > > On Thu, Mar 23, 2023 at 6:24 PM Alexandre Ghiti <[email protected]> wrote:
> > >> Hi Anup,
> > >>
> > >> On Thu, Mar 23, 2023 at 1:18 PM Anup Patel <[email protected]> wrote:
> > >>> Hi Alex,
> > >>>
> > >>> On Thu, Mar 16, 2023 at 6:48 PM Alexandre Ghiti <[email protected]> wrote:
> > >>>> This patchset intends to improve tlb utilization by using hugepages for
> > >>>> the linear mapping.
> > >>>>
> > >>>> As reported by Anup in v6, when STRICT_KERNEL_RWX is enabled, we must
> > >>>> take care of isolating the kernel text and rodata so that they are not
> > >>>> mapped with a PUD mapping which would then assign wrong permissions to
> > >>>> the whole region: it is achieved by introducing a new memblock API.
> > >>>>
> > >>>> Another patch makes use of this new API in arm64 which used some sort of
> > >>>> hack to solve this issue: it was built/boot tested successfully.
> > >>>>
> > >>>> base-commit-tag: v6.3-rc1
> > >>>>
> > >>>> v8:
> > >>>> - Fix rv32, as reported by Anup
> > >>>> - Do not modify memblock_isolate_range and fixes comment, as suggested by Mike
> > >>>> - Use the new memblock API for crash kernel too in arm64, as suggested by Andrew
> > >>>> - Fix arm64 double mapping (which to me did not work in v7), but ends up not
> > >>>> being pretty at all, will wait for comments from arm64 reviewers, but
> > >>>> this patch can easily be dropped if they do not want it.
> > >>>>
> > >>>> v7:
> > >>>> - Fix Anup bug report by introducing memblock_isolate_memory which
> > >>>> allows us to split the memblock mappings and then avoid mapping
> > >>>> the PUD which contains the kernel as read only
> > >>>> - Add a patch to arm64 to use this newly introduced API
> > >>>>
> > >>>> v6:
> > >>>> - quiet LLVM warning by casting phys_ram_base into an unsigned long
> > >>>>
> > >>>> v5:
> > >>>> - Fix nommu builds by getting rid of riscv_pfn_base in patch 1, thanks
> > >>>> Conor
> > >>>> - Add RB from Andrew
> > >>>>
> > >>>> v4:
> > >>>> - Rebase on top of v6.2-rc3, as noted by Conor
> > >>>> - Add Acked-by Rob
> > >>>>
> > >>>> v3:
> > >>>> - Change the comment about initrd_start VA conversion so that it fits
> > >>>> ARM64 and RISCV64 (and others in the future if needed), as suggested
> > >>>> by Rob
> > >>>>
> > >>>> v2:
> > >>>> - Add a comment on why RISCV64 does not need to set initrd_start/end that
> > >>>> early in the boot process, as asked by Rob
> > >>>>
> > >>>> Alexandre Ghiti (4):
> > >>>> riscv: Get rid of riscv_pfn_base variable
> > >>>> mm: Introduce memblock_isolate_memory
> > >>>> arm64: Make use of memblock_isolate_memory for the linear mapping
> > >>>> riscv: Use PUD/P4D/PGD pages for the linear mapping
> > >>> The kernel boots fine on RV64, but there is a failure which is still
> > >>> not addressed. You can see this failure as the following message in
> > >>> the kernel boot log:
> > >>> [    0.000000] Failed to add a System RAM resource at 80200000
> > >> Hmmm I don't get that in any of my test configs, would you mind
> > >> sharing yours and your qemu command line?
> > > Try alexghiti_test branch at
> > > https://github.com/avpatel/linux.git
> > >
> > > I am building the kernel using defconfig and my rootfs is
> > > based on busybox.
> > >
> > > My QEMU command is:
> > > qemu-system-riscv64 -M virt -m 512M -nographic -bios
> > > opensbi/build/platform/generic/firmware/fw_dynamic.bin -kernel
> > > ./build-riscv64/arch/riscv/boot/Image -append "root=/dev/ram rw
> > > console=ttyS0 earlycon" -initrd ./rootfs_riscv64.img -smp 4
> >
> >
> > So splitting memblock.memory is the culprit, it "confuses" the resources
> > addition and I can only find hacky ways to fix that...
> Hi Alexandre,
>
> We encountered the same error as Anup. After applying your patch
> (3335068f87217ea59d08f462187dc856652eea15), the error no longer
> occurs.
>
> What I have observed so far:
>
> - Before your patch:
>   When consecutive memblocks are merged, if the memblock types are
>   different, they will be merged into reserved.
> - After your patch:
>   When consecutive memblocks are merged, if the memblock types are
>   different, they will be merged into memory.
>
> As a result, the memory region occupied by OpenSBI changes from
> reserved to memory. Will this have any side effects?

I guess it will end up in the memory pool and pages from the OpenSBI
region will be allocated, so we should very quickly see bad things
happen (either a PMP violation, or an M-mode ecall never returning,
trapping, etc.).

But I don't observe the same thing: I always see the OpenSBI region
being reserved:

reserved[0x0] [0x0000000080000000-0x000000008007ffff], 0x0000000000080000 bytes flags: 0x0
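
A dump line in the format shown above can be checked mechanically, which may help when comparing boot logs from two kernels. The following is a minimal sketch, assuming the log was captured from a boot with memblock=debug and that every entry follows the "name[idx] [start-end], size bytes flags: value" layout; parse_dump and is_reserved are hypothetical helper names, not existing tools.

```python
import re

# One memblock dump entry, e.g.
#   reserved[0x0] [0x0000000080000000-0x000000008007ffff], 0x... bytes flags: 0x0
DUMP_RE = re.compile(
    r"(?P<type>\w+)\[(?P<idx>0x[0-9a-fA-F]+)\]\s+"
    r"\[(?P<start>0x[0-9a-fA-F]+)-(?P<end>0x[0-9a-fA-F]+)\],\s+"
    r"(?P<size>0x[0-9a-fA-F]+) bytes.*?flags: (?P<flags>0x[0-9a-fA-F]+)"
)


def parse_dump(text):
    """Return (type, start, end, flags) tuples; end is inclusive."""
    # Collapse whitespace first so entries wrapped by a mail client still parse.
    flat = " ".join(text.split())
    return [
        (m["type"], int(m["start"], 16), int(m["end"], 16), int(m["flags"], 16))
        for m in DUMP_RE.finditer(flat)
    ]


def is_reserved(regions, start, end):
    """True if the inclusive range [start, end] lies inside one reserved entry."""
    return any(
        kind == "reserved" and lo <= start and end <= hi
        for kind, lo, hi, _flags in regions
    )
```

Feeding it the line above and asking about 0x80000000-0x8007ffff should report the OpenSBI range as reserved; running the same check on a log from the failing setup would show whether that entry is really missing there.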

Can you elaborate a bit more about "When consecutive memblocks are
merged, if the memblock types are different, they will be merged into
memory"? Where/when does this merge happen? Can you give me a config
file and a kernel revision so that I can take a look?

Thanks,

Alex

> >
> > So given that the arm64 patch with the new API is not pretty and that
> > the simplest solution is to re-merge the memblock regions afterwards
> > (which is done by memblock_clear_nomap), I'll drop the new API and the
> > arm64 patch to use the nomap API like arm64: I'll take advantage of that
> > to clean setup_vm_final which I have wanted to do for a long time.
> >
> > @Mike Thanks for your reviews!
> >
> > @Anup Thanks for all your bug reports on this patchset, I have to
> > improve my test flow (it is in the works :)).
> >
> >
> > > Regards,
> > > Anup
> > >
> > >> Thanks
> > >>
> > >>> Regards,
> > >>> Anup
> > >>>
> > >>>> arch/arm64/mm/mmu.c | 25 +++++++++++------
> > >>>> arch/riscv/include/asm/page.h | 19 +++++++++++--
> > >>>> arch/riscv/mm/init.c | 53 ++++++++++++++++++++++++++++-------
> > >>>> arch/riscv/mm/physaddr.c | 16 +++++++++++
> > >>>> drivers/of/fdt.c | 11 ++++----
> > >>>> include/linux/memblock.h | 1 +
> > >>>> mm/memblock.c | 20 +++++++++++++
> > >>>> 7 files changed, 119 insertions(+), 26 deletions(-)
> > >>>>
> > >>>> --
> > >>>> 2.37.2
> > >>>>