2024-02-22 20:59:51

by Conor Dooley

Subject: Re: [RFC PATCH 0/1] riscv: dts: Allow BUILTIN_DTB for all socs

On Wed, Feb 21, 2024 at 10:28:08PM +0800, Yangyu Chen wrote:
> On Wed, 2024-02-21 at 11:30 +0000, Conor Dooley wrote:
> > Hey,
> >
> > On Wed, Feb 21, 2024 at 03:01:53AM +0800, Yangyu Chen wrote:
> > > The BUILTIN_DTB kernel feature on RISC-V only works on K210 SoC
> > > only. This
> > > patch moved this configuration to entire riscv.
> >
> > To be honest, I would rather delete BUILTIN_DTB (and the
> > configurations
> > that depend on it) than expand its usefulness.
> >
>
> I agree it’s useless for most platforms because we need to start SBI
> before kernel on RISC-V except NOMMU M-Mode Linux and SBI also need a
> DT to work. However, it has been there for M-Mode K210 and it is set by
> default for XIP kernel. So there might eventually be another patch to
> support some new soc that will do this like this patch.

To be clear, I was not suggesting that it was useless. I was saying that
I would rather reduce the number of configurations that use builtin dtbs
than increase the level of support for it.

>
> > > Although BUILTIN_DTB is not a good choice for most platforms, it is
> > > likely
> > > to be a debug feature when some bootloader will always override
> > > something
> > > like the memory node in the device tree to adjust the memory size
> > > from SPD
> > > or configuration resistor, which makes it hard to do some
> > > debugging.
> >
> > My inclination here is to say "fix your bootloader" and if that's not
> > possible, chainload a bootloader that allows you control over
> > modifications to your devicetree.
> >
>
> Chainload a bootloader like S-Mode U-Boot on some platforms is hard due
> to some drivers like pcie controller does not come to the mainline repo
> of the bootloader, and some bootloader source repos provided by the
> vendor may require specific versions of the compiler to work, which
> makes users not easy to do some kernel debugging if change DT is
> needed. The simplest way to do this I can imagine is to write a simple
> bootloader by myself link the kernel binary and the dtb I want to it
> and replace the a1 register point to the dtb address before jumping to
> the kernel. However, kernel has this feature, why should I do it
> manually rather than provide a more generic patch for everyone with
> this need to use?
>
> > > As an
> > > example, some platforms with numa like sg2042 only support sv39
> > > will fail
> > > to boot when there is no ZONE_HIGHMEM patch with 128G memory. If we
> > > want
> > > a kernel without this patch to boot, we need to write the memory
> > > nodes
> > > in the DT manually.
> >
> > If, as Alex suggests, there's a way to gain support some more memory
> > in
> > sv39, we should do so - but it is worth mentioning that highmem is on
> > the
> > removal list for the kernel, so mainline support for that is highly
> > unlikely.
> >
>
> Yes. But I’m debugging some mm performance issues on the sg2042 kernel.
> Specifically, it’s about the IPI latency when doing rfence on
> sfence.vma or fence.i. I would like to reduce the memory size and allow
> the mainline kernel to boot and test without taking some out-of-tree
> kernel patches. If I remove some DIMM modules from the board to reduce
> the memory size, it will also lose some memory channels and even leave
> some numa nodes with zero memory, and the compatible DIMM module is
> hard to find.

I'm not really sure how this relates to my comment about HIGHMEM. If
Alex is able to give you the extra 4 GiB of memory that he says there is
space for in the memory map, will the device boot properly?

> > > Also, changing DT on some platforms is not easy. For Milk-V
> > > Pioneer, the
> > > boot procedure is ZSBL -> OpenSBI -> LinuxBoot -> Linux. If DT gets
> > > changed, OpenSBI or LinuxBoot may refuse to boot. And there is some
> > > bug on
> > > LinuxBoot now which does not consume --dtb argument on kexec and
> > > always
> > > uses DT from memory.
> >
> > I don't use Linuxboot, but let me try to understand. Linuxboot uses
> > kexec
> > to boot the main Linux kernel, but the dtb you want to use is not
> > used, and
> > instead the one that Linuxboot itself was booted with is used?
> >
> > It sounds like Linuxboot has a --dtb argument that is meant to be used
> > to
> > set the dtb for the next stage, but that argument is being ignored?
> >
>
> Yes. That’s correct.
>
> > That sounds like a pretty serious issue with Linuxboot which should
> > be
> > fixed - what am I missing?
> >
>
> Sure, that should be fixed in the LinuxBoot. However, I think not every
> kernel developer should fix some complex bootloader like LinuxBoot
> which is built upon the linux kernel with a huge initrd rootfs and runs
> some userspace tools to support the boot process. If something is hard
> to control, skip it, and doing some override for debugging will be a
> better choice.

Has anyone even /reported/ the issues with LinuxBoot to the LinuxBoot
developers? Without that being fixed, there's unlikely to ever be
mainstream distro support for it, since they're going to have to build
kernels for it alone.

> > > So I would like to do debugging on DT using
> > > BUILTIN_DTB, which makes it very simple,
> >
> > > I can even install the kernel in
> > > the distro's way and provide a kernel package for other users to
> > > test.
> >
> > I'm not sure what you mean by this, other distros manage to create
> > kernel packages without using builtin dtbs.
> >
>
> I mean I can provide a distro package like Debian .deb and distribute
> it to other users to test without changing their dtb from the entire
> boot process.

Other distros, like Ubuntu, manage to do this without relying on builtin
dtbs. I suppose this comes down to having bootloaders that

> Because changing the DT from the entire boot process
> might prevent their vendor-provided OpenSBI or LinuxBoot from working.
> Some vendor kernels may be developed out-of-tree and do not use the dt-
> binding from mainline. Even for very basic CLINT and PLIC dt bindings.

Which is verging on ridiculous at this point. Does the sg2042 also have
a version of OpenSBI that is not capable of booting a mainline kernel?

> It is only for testing, not for the production environment.

If things are just for testing, I'm not particularly keen on merging on
that basis alone. We all have various bits of testing code that doesn't
end up being merged to mainline. That said, it is broken at present and
it's hard to argue against fixing it and any patch fixing it would
ultimately look very similar to your patch here.

> I want this feature to allow more people to participate in debugging
> some kernel issues without taking a huge amount of time to deal with
> bootloader issues about changing the DT. I think it will be good for
> our under-development RISC-V community.

And on the other hand, it provides no incentive for vendors to fix
broken bootloaders or firmware, which is something we suffer from on RISC-V,
in particular vendors that ship T-Head's vendor copy of OpenSBI.

> Imagine we hardly change the
> ACPI table for x86 machines but we sometimes change the DT for
> ARM/RISC-V board, right?

Usually we change them because nobody gets things "right" and we end up
having different stuff in mainline to what the vendor did. Usually also
a vendor has a relatively complete description in their vendor tree, but
things only trickle into mainline, so mainline ends up requiring regular
dtb updates until a platform stabilises. More infrequently, changes are
needed for bugfixes.

The other thing you do is compare to the ACPI table. I don't think it is
quite apples to apples there - those machines mostly have devices on
discoverable buses etc. If they had the same number of non discoverable
devices, I think you'd end up having to do more BIOS updates etc.

> Also, some SoCs that run M-Mode NOMMU Linux
> may need it in the future like K210 for XIP without a prior bootloader.

And the k210 is one of the things that is on the chopping block at the
moment. Its removal was discussed at LPC this year, with Damien
surprisingly agreeing to its removal. FWIW, builtin dtb is not required
for XIP.

BTW, I noticed that your patch only removes one of the $(addsuffix)
calls in a platform makefile.

Thanks,
Conor.



2024-02-23 07:42:23

by Yangyu Chen

Subject: Re: [RFC PATCH 0/1] riscv: dts: Allow BUILTIN_DTB for all socs



On 2024/2/23 04:59, Conor Dooley wrote:
> On Wed, Feb 21, 2024 at 10:28:08PM +0800, Yangyu Chen wrote:
>> On Wed, 2024-02-21 at 11:30 +0000, Conor Dooley wrote:
>>> Hey,
>>>
>>> On Wed, Feb 21, 2024 at 03:01:53AM +0800, Yangyu Chen wrote:
>>>> The BUILTIN_DTB kernel feature on RISC-V only works on K210 SoC
>>>> only. This
>>>> patch moved this configuration to entire riscv.
>>>
>>> To be honest, I would rather delete BUILTIN_DTB (and the
>>> configurations
>>> that depend on it) than expand its usefulness.
>>>
>>
>> I agree it’s useless for most platforms because we need to start SBI
>> before kernel on RISC-V except NOMMU M-Mode Linux and SBI also need a
>> DT to work. However, it has been there for M-Mode K210 and it is set by
>> default for XIP kernel. So there might eventually be another patch to
>> support some new soc that will do this like this patch.
>
> To be clear, I was not suggesting that it was useless. I was saying that
> I would rather reduce the number of configurations that use builtin dtbs
> than increase the level of support for it.
>

I see.

>>
>>>> Although BUILTIN_DTB is not a good choice for most platforms, it is
>>>> likely
>>>> to be a debug feature when some bootloader will always override
>>>> something
>>>> like the memory node in the device tree to adjust the memory size
>>>> from SPD
>>>> or configuration resistor, which makes it hard to do some
>>>> debugging.
>>>
>>> My inclination here is to say "fix your bootloader" and if that's not
>>> possible, chainload a bootloader that allows you control over
>>> modifications to your devicetree.
>>>
>>
>> Chainload a bootloader like S-Mode U-Boot on some platforms is hard due
>> to some drivers like pcie controller does not come to the mainline repo
>> of the bootloader, and some bootloader source repos provided by the
>> vendor may require specific versions of the compiler to work, which
>> makes users not easy to do some kernel debugging if change DT is
>> needed. The simplest way to do this I can imagine is to write a simple
>> bootloader by myself link the kernel binary and the dtb I want to it
>> and replace the a1 register point to the dtb address before jumping to
>> the kernel. However, kernel has this feature, why should I do it
>> manually rather than provide a more generic patch for everyone with
>> this need to use?
>>
>>>> As an
>>>> example, some platforms with numa like sg2042 only support sv39
>>>> will fail
>>>> to boot when there is no ZONE_HIGHMEM patch with 128G memory. If we
>>>> want
>>>> a kernel without this patch to boot, we need to write the memory
>>>> nodes
>>>> in the DT manually.
>>>
>>> If, as Alex suggests, there's a way to gain support some more memory
>>> in
>>> sv39, we should do so - but it is worth mentioning that highmem is on
>>> the
>>> removal list for the kernel, so mainline support for that is highly
>>> unlikely.
>>>
>>
>> Yes. But I’m debugging some mm performance issues on the sg2042 kernel.
>> Specifically, it’s about the IPI latency when doing rfence on
>> sfence.vma or fence.i. I would like to reduce the memory size and allow
>> the mainline kernel to boot and test without taking some out-of-tree
>> kernel patches. If I remove some DIMM modules from the board to reduce
>> the memory size, it will also lose some memory channels and even leave
>> some numa nodes with zero memory, and the compatible DIMM module is
>> hard to find.
>
> I'm not really sure how this relates to my comment about HIGHMEM. If
> Alex is able to give you the extra 4 GiB of memory that he says there is
> space for in the memory map, will the device boot properly?
>

That is why I said I want the "mainline kernel to boot and test without
taking some out-of-tree kernel patches", as that support is not in
mainline yet. I also don't see any performance issues on SiFive SoCs
with the mainline kernel, but they don't have as many cores as the
sg2042 either. In any case, my reason is to simplify debugging of the
performance issue, not to get the sg2042 to boot properly with 128G of
memory.

>>>> Also, changing DT on some platforms is not easy. For Milk-V
>>>> Pioneer, the
>>>> boot procedure is ZSBL -> OpenSBI -> LinuxBoot -> Linux. If DT gets
>>>> changed, OpenSBI or LinuxBoot may refuse to boot. And there is some
>>>> bug on
>>>> LinuxBoot now which does not consume --dtb argument on kexec and
>>>> always
>>>> uses DT from memory.
>>>
>>> I don't use Linuxboot, but let me try to understand. Linuxboot uses
>>> kexec
>>> to boot the main Linux kernel, but the dtb you want to use is not
>>> used, and
>>> instead the one that Linuxboot itself was booted with is used?
>>>
>>> It sounds like Linuxboot has a --dtb argument that is meant to be used
>>> to
>>> set the dtb for the next stage, but that argument is being ignored?
>>>
>>
>> Yes. That’s correct.
>>
>>> That sounds like a pretty serious issue with Linuxboot which should
>>> be
>>> fixed - what am I missing?
>>>
>>
>> Sure, that should be fixed in the LinuxBoot. However, I think not every
>> kernel developer should fix some complex bootloader like LinuxBoot
>> which is built upon the linux kernel with a huge initrd rootfs and runs
>> some userspace tools to support the boot process. If something is hard
>> to control, skip it, and doing some override for debugging will be a
>> better choice.
>
> Has anyone even /reported/ the issues with LinuxBoot to the LinuxBoot
> developers? Without that being fixed, there's unlikely to ever be
> mainstream distro support for it, since they're going to have to build
> kernels for it alone.
>

I created a GitHub issue on sophgo/bootloader-riscv [1]. It seems nobody
had reported it before. Yes, it would be better to fix LinuxBoot to meet
my own debugging needs.

>>>> So I would like to do debugging on DT using
>>>> BUILTIN_DTB, which makes it very simple,
>>>
>>>> I can even install the kernel in
>>>> the distro's way and provide a kernel package for other users to
>>>> test.
>>>
>>> I'm not sure what you mean by this, other distros manage to create
>>> kernel packages without using builtin dtbs.
>>>
>>
>> I mean I can provide a distro package like Debian .deb and distribute
>> it to other users to test without changing their dtb from the entire
>> boot process.
>
> Other distros, like Ubuntu, manage to do this without relying on builtin
> dtbs. I suppose this comes down to having bootloaders that
>
>> Because changing the DT from the entire boot process
>> might prevent their vendor-provided OpenSBI or LinuxBoot from working.
>> Some vendor kernels may be developed out-of-tree and do not use the dt-
>> binding from mainline. Even for very basic CLINT and PLIC dt bindings.
>
> Which is verging on ridiculous at this point. Does the sg2042 also have
> a version of OpenSBI that is not capable of booting a mainline kernel?
>

Yes, the old vendor-provided OpenSBI cannot parse the ACLINT dt-binding
from the mainline dts, so OpenSBI finds no timer and refuses to boot.
That can be solved if I change the DT and use mainline OpenSBI with some
vendor patches cherry-picked to get it to work.

>> It is only for testing, not for the production environment.
>
> If things are just for testing, I'm not particularly keen on merging on
> that basis alone. We all have various bits of testing code that doesn't
> end up being merged to mainline. That said, it is broken at present and
> it's hard to argue against fixing it and any patch fixing it would
> ultimately look very similar to your patch here.
>

OK, that reasoning convinces me.

>> I want this feature to allow more people to participate in debugging
>> some kernel issues without taking a huge amount of time to deal with
>> bootloader issues about changing the DT. I think it will be good for
>> our under-development RISC-V community.
>
> And on the other hand, it provides no incentive for vendors to fix
> broken bootloaders or firmware, which is something we suffer from on RISC-V,
> in particular vendors that ship T-Head's vendor copy of OpenSBI.
>

That's true.

>> Imagine we hardly change the
>> ACPI table for x86 machines but we sometimes change the DT for
>> ARM/RISC-V board, right?
>
> Usually we change them because nobody gets things "right" and we end up
> having different stuff in mainline to what the vendor did. Usually also
> a vendor has a relatively complete description in their vendor tree, but
> things only trickle into mainline, so mainline ends up requiring regular
> dtb updates until a platform stabilises. More infrequently, changes are
> needed for bugfixes.
>
> The other thing you do is compare to the ACPI table. I don't think it is
> quite apples to apples there - those machines mostly have devices on
> discoverable buses etc. If they had the same number of non discoverable
> devices, I think you'd end up having to do more BIOS updates etc.
>

OK.

>> Also, some SoCs that run M-Mode NOMMU Linux
>> may need it in the future like K210 for XIP without a prior bootloader.
>
> And the k210 is one of the things that is on the chopping block at the
> moment. Its removal was discussed at LPC this year, with Damien
> surprisingly agreeing to its removal. FWIW, builtin dtb is not required
> for XIP.
>
> BTW, I noticed that your patch only removes one of the $(addsuffix)
> calls in a platform makefile.
> Thanks,
> Conor.

To sum up, I agree with the reasons for refusing it as a debugging-only
feature. I am wondering what to do next. After reviewing the code
carefully, I found that this feature exists not only for the K210 but
also for other SoCs. For those other SoCs, however, it is broken:
multiple dtbs get linked into the kernel, and the kernel always picks up
the first one, as discussed in this thread [2]. That is because
SOC_BUILTIN_DTB_DECLARE was removed by this patch [3], so there is no
longer any code to select among multiple dtbs, and since this patch [4]
the makefiles of the other SoCs in mainline build every dtb object file.
So this feature is currently broken for other SoCs.
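
To make the failure mode concrete, here is a toy model in Python (not
kernel code). It assumes the built-in dtbs end up back to back in one
section and that the consumer only looks at the start of that section.
The names fake_dtb, first_blob and all_blobs are made up for this
sketch, and a real section may also contain alignment padding between
blobs. The only real detail is the FDT header layout: a big-endian magic
0xd00dfeed followed by a big-endian totalsize.

```python
import struct

FDT_MAGIC = 0xD00DFEED  # magic number at offset 0 of a flattened devicetree


def fake_dtb(name: bytes) -> bytes:
    """Build a stand-in blob: magic + totalsize, then a payload.

    This is NOT a valid devicetree; it carries just enough of the
    header to illustrate the layout problem.
    """
    payload = name.ljust(8, b"\0")
    total = 8 + len(payload)
    return struct.pack(">II", FDT_MAGIC, total) + payload


def first_blob(section: bytes) -> bytes:
    """Model a consumer that only ever reads from the section start."""
    magic, total = struct.unpack_from(">II", section, 0)
    if magic != FDT_MAGIC:
        raise ValueError("no devicetree at start of section")
    return section[:total]


def all_blobs(section: bytes) -> list:
    """Walk the section using each header's totalsize field."""
    blobs = []
    off = 0
    while off + 8 <= len(section):
        magic, total = struct.unpack_from(">II", section, off)
        if magic != FDT_MAGIC or total < 8:
            break  # padding or garbage: stop walking
        blobs.append(section[off:off + total])
        off += total
    return blobs


if __name__ == "__main__":
    # Two boards' dtbs linked one after the other (hypothetical names).
    section = fake_dtb(b"board-a") + fake_dtb(b"board-b")
    print(len(all_blobs(section)), "blobs linked")
    print("used:", first_blob(section)[8:].rstrip(b"\0").decode())
```

Whatever comes first in link order is what gets used; the second blob is
present in the image but nothing ever selects it.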

The choices might be one of the following:

1. Remove the BUILTIN_DTB feature if K210 support gets removed
2. Continue working on this feature so that it works for other SoCs

I prefer to continue and get this feature working, not only for my
debugging purposes but also to fix what is currently broken.

[1] https://github.com/sophgo/bootloader-riscv/issues/73
[2]
https://lore.kernel.org/linux-riscv/CAK7LNATt_56mO2Le4v4EnPnAfd3gC8S_Sm5-GCsfa=qXy=8Lrg@mail.gmail.com/
[3]
https://lore.kernel.org/linux-riscv/[email protected]/
[4]
https://lore.kernel.org/linux-riscv/[email protected]/