2023-05-21 02:31:31

by Drew Fustini

Subject: riscv: boot failure for 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping")

Hello, I tested 6.4-rc1 on an internal RISC-V SoC and observed a boot
failure on a Store/AMO access fault (exception code 7) in __memset().
stval (i.e. badaddr) was set to 0xffffaf8000000000. This SoC is RV64GC
with Sv48 so it seems that address is the start of the "direct mapping
of all physical memory" [1].

The 6.3 release boots okay and the system is able to operate correctly
with an Ubuntu 23.04 rootfs on eMMC. Therefore, I decided to bisect and
I found the failure begins with 3335068f8721 ("riscv: Use PUD/P4D/PGD
pages for the linear mapping"). The system boots okay with the prior
commit 8589e346bbb6 ("riscv: Move the linear mapping creation in its
own function").

The boot log [2] shows that the fault happens right after buildroot's
init script [3] uses switch_root to execute init from the Ubuntu rootfs
on the eMMC.

DWARF4 is enabled in .config [4] and the decoded stack trace [5] shows:

epc : __memset (/eng/dfustini/gitlab/linux/arch/riscv/lib/memset.S:67)

From memset.S:

Line 67: REG_S a1, 0(t0)

From the oops:

epc : ffffffff81122d6c ra : ffffffff80218504 sp : ffffaf8002e47500
gp : ffffffff82695010 tp : ffffaf8002e2ec00 t0 : ffffaf8000000000
t1 : 0000000000000080 t2 : 0000000000000001 s0 : ffffaf8002e47550
s1 : ffff8d8200000040 a0 : ffffaf8000000000 a1 : 0000000000000000

Thus I think it is trying to store 0x0 to 0xffffaf8000000000 which is
the start of the direct map. From the boot log [2], OpenSBI shows:

Domain0 Region00 : 0x0000000002080000-0x00000000020bffff M: (I,R,W) S/U: ()
Domain0 Region01 : 0x0000008000000000-0x000000800003ffff M: (R,W,X) S/U: ()
Domain0 Region02 : 0x0000000002000000-0x000000000207ffff M: (I,R,W) S/U: ()
Domain0 Region03 : 0x0000000000000000-0xffffffffffffffff M: (R,W,X) S/U: (R,W,X)

The DDR memory on this SoC starts at 0x8000000000 with size 2GB. The
memory node from the device tree [6]:

memory@8000000000 {
    device_type = "memory";
    reg = <0x80 0 0x00000000 0x80000000>;
};

I think the direct map address 0xffffaf8000000000 would map to physical
address 0x8000000000. Thus I think the attempted store in S-mode to that
address would violate the PMP settings for Region01.
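
The reasoning above can be sketched as a small calculation. This is only
an illustration, not kernel code: it assumes the memory node uses
#address-cells = <2> and #size-cells = <2>, that the linear map places
the start of DRAM at the Sv48 direct-map base, and that the first
region in OpenSBI's list that contains an address decides the S/U-mode
permissions. All function names are hypothetical.

```python
# Constants copied from the report; the formulas below are assumptions.
PAGE_OFFSET_SV48 = 0xFFFFAF8000000000  # start of the Sv48 direct map [1]

# reg = <0x80 0 0x00000000 0x80000000>, assuming 2 address + 2 size cells.
REG_CELLS = [0x80, 0x0, 0x00000000, 0x80000000]

# (name, first byte, last byte, S/U permissions) as printed by OpenSBI.
REGIONS = [
    ("Region00", 0x0000000002080000, 0x00000000020BFFFF, ""),
    ("Region01", 0x0000008000000000, 0x000000800003FFFF, ""),
    ("Region02", 0x0000000002000000, 0x000000000207FFFF, ""),
    ("Region03", 0x0000000000000000, 0xFFFFFFFFFFFFFFFF, "RWX"),
]


def decode_reg(cells):
    """Combine 32-bit cells into a 64-bit (base, size) pair."""
    base = (cells[0] << 32) | cells[1]
    size = (cells[2] << 32) | cells[3]
    return base, size


def linear_va_to_pa(va, dram_base):
    """Assumed linear-map translation: PAGE_OFFSET maps to the DRAM base."""
    return va - PAGE_OFFSET_SV48 + dram_base


def first_matching_region(pa):
    """Return (name, S/U permissions) of the first listed region holding pa."""
    for name, start, end, su_perms in REGIONS:
        if start <= pa <= end:
            return name, su_perms
    return None


dram_base, dram_size = decode_reg(REG_CELLS)
fault_pa = linear_va_to_pa(0xFFFFAF8000000000, dram_base)  # stval from the oops
name, su_perms = first_matching_region(fault_pa)
print(hex(dram_base), hex(dram_size), hex(fault_pa), name, "W" in su_perms)
```

Under those assumptions the faulting address lands in Region01, which
grants no S/U-mode write permission, consistent with the Store/AMO
access fault.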

I do not yet understand why this happens with 3335068f8721 ("riscv: Use
PUD/P4D/PGD pages for the linear mapping") but not for the prior commit
8589e346bbb6 ("riscv: Move the linear mapping creation in its own
function").

One important caveat: I do have a small diff from mainline to add
support for the eMMC controller in this SoC to sdhci-of-dwcmshc.c. The
output of 'git diff' when 3335068f8721 is checked out [7] shows that
this just adds a new compatible and corresponding sdhci_ops struct.
Everything works ok with this change in both the 6.3 release and the
commit prior to 3335068f8721.

I know it is a bit awkward for me to report a boot failure for an
internal SoC but I am hoping to find a better solution than just
reverting this change in the downstream kernel.

The reason that so few changes are needed to run Linux on this SoC is
that there is a service processor that handles all the low-level tasks
like setting up clocks and configuring various peripheral controllers.
Everything is already set up and ready to go by the time the hart meant
to run OpenSBI+Linux (fw_payload.bin) comes out of reset.

Note: normally Linux runs on all four harts but I reduced to running on
a single hart to simplify diagnosing this boot failure.

Thanks,
Drew

[1] https://docs.kernel.org/riscv/vm-layout.html#risc-v-linux-kernel-sv48
[2] boot log: https://gist.github.com/pdp7/afe78604f477c9e3a3cf0241bcdffcdb
[3] init script: https://gist.github.com/pdp7/8d61bafbca55e987b790433c0353831d
[4] linux .config: https://gist.github.com/pdp7/a4df66f1359a34194bddd32f74ab38a3
[5] stacktrace: https://gist.github.com/pdp7/0524892ea319775ea70e43a54cc842a9
[6] mysoc.dts: https://gist.github.com/pdp7/cd1b2e8e8d3f6047efd53e4ef65664da
[7] git diff: https://gist.github.com/pdp7/581c9e8415da94a29d34ae6d7cc14669


2023-05-21 04:04:54

by Samuel Holland

Subject: Re: riscv: boot failure for 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping")

Hi Drew,

On 5/20/23 21:05, Drew Fustini wrote:
> Hello, I tested 6.4-rc1 on an internal RISC-V SoC and observed a boot
> failure on a Store/AMO access fault (exception code 7) in __memset().
> stval (e.g. badaddr) was set to 0xffffaf8000000000. This SoC is RV64GC
> with Sv48 so it seems that address is the start of the "direct mapping
> of all physical memory" [1].
>
> The 6.3 release boots okay and the system is able to operate correctly
> with an Ubuntu 23.04 rootfs on eMMC. Therefore, I decided to bisect and
> I found the failure begins with 3335068f8721 ("riscv: Use PUD/P4D/PGD
> pages for the linear mapping"). The system boots okay with the prior
> commit 8589e346bbb6 ("riscv: Move the linear mapping creation in its
> own function").
>
> The boot log [2] shows that the fault happens right after buildroot's
> init script [3] uses switch_root to execute init from the Ubuntu rootfs
> on the eMMC.
>
> DWARF4 is enabled in .config [4] and the decoded stack trace [5] shows:
>
> epc : __memset (/eng/dfustini/gitlab/linux/arch/riscv/lib/memset.S:67)
>
> From memset.S:
>
> Line 67: REG_S a1, 0(t0)
>
> From the oops:
>
> epc : ffffffff81122d6c ra : ffffffff80218504 sp : ffffaf8002e47500
> gp : ffffffff82695010 tp : ffffaf8002e2ec00 t0 : ffffaf8000000000
> t1 : 0000000000000080 t2 : 0000000000000001 s0 : ffffaf8002e47550
> s1 : ffff8d8200000040 a0 : ffffaf8000000000 a1 : 0000000000000000
>
> Thus I think it is trying to store 0x0 to 0xffffaf8000000000 which is
> the start of the direct map. From the boot log [2], OpenSBI shows:
>
> Domain0 Region00 : 0x0000000002080000-0x00000000020bffff M: (I,R,W) S/U: ()
> Domain0 Region01 : 0x0000008000000000-0x000000800003ffff M: (R,W,X) S/U: ()
> Domain0 Region02 : 0x0000000002000000-0x000000000207ffff M: (I,R,W) S/U: ()
> Domain0 Region03 : 0x0000000000000000-0xffffffffffffffff M: (R,W,X) S/U: (R,W,X)
>
> The DDR memory on this SoC starts at 0x8000000000 with size 2GB. The
> memory node from the device tree [6]:
>
> memory@8000000000 {
> device_type = "memory";
> reg = <0x80 0 0x00000000 0x80000000>;
> };
>
> I think the direct map address 0xffffaf8000000000 would map to physical
> address 0x8000000000. Thus I think the attempted store in S-mode to that
> address would violate the PMP settings for Region01.
>
> I do not yet understand why this happens with 3335068f8721 ("riscv: Use
> PUD/P4D/PGD pages for the linear mapping") but not for the prior commit
> 8589e346bbb6 ("riscv: Move the linear mapping creation in its own
> function").

Where does Linux's DTB come from? It should be the one that was modified
by OpenSBI to add a reserved-memory node matching PMP Region01
(fdt_reserved_memory_fixup()).

Before this commit, Linux ignored the first 2 MiB of physical RAM. So if
OpenSBI was loaded in this region, you could get away with ignoring the
firmware-provided DTB; now you actually need to use it, as intended.
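
As a rough illustration of that point (a simplified model, not what the
kernel actually computes): treat the old behaviour as a linear mapping
that starts 2 MiB above the DRAM base and the new behaviour as one that
starts at the DRAM base, then check each against PMP Region01 from the
boot log. The helper names are hypothetical.

```python
MIB = 1024 * 1024

DRAM_BASE = 0x8000000000          # from the memory node
DRAM_SIZE = 2048 * MIB            # 2 GiB, from the memory node
FW_START = 0x8000000000           # PMP Region01 start (OpenSBI boot log)
FW_END = 0x800003FFFF             # PMP Region01 end, inclusive


def overlaps(a_start, a_end, b_start, b_end):
    """True if the inclusive ranges [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end


def mapped_range(skip_bytes):
    """Inclusive physical range covered by the linear map in this model."""
    return DRAM_BASE + skip_bytes, DRAM_BASE + DRAM_SIZE - 1


# Model of the old behaviour: first 2 MiB of RAM ignored.
old_hits_fw = overlaps(*mapped_range(2 * MIB), FW_START, FW_END)
# Model of the new behaviour: RAM mapped from its very first byte.
new_hits_fw = overlaps(*mapped_range(0), FW_START, FW_END)

print("old mapping touches firmware region:", old_hits_fw)
print("new mapping touches firmware region:", new_hits_fw)
```

In this model only the new mapping covers the firmware region, which is
why the reserved-memory node from the firmware-provided DTB starts to
matter.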

Regards,
Samuel


2023-05-21 22:30:24

by Drew Fustini

Subject: Re: riscv: boot failure for 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping")

On Sat, May 20, 2023 at 10:22:36PM -0500, Samuel Holland wrote:
> Hi Drew,
>
> On 5/20/23 21:05, Drew Fustini wrote:
> > Hello, I tested 6.4-rc1 on an internal RISC-V SoC and observed a boot
> > failure on a Store/AMO access fault (exception code 7) in __memset().
> > stval (e.g. badaddr) was set to 0xffffaf8000000000. This SoC is RV64GC
> > with Sv48 so it seems that address is the start of the "direct mapping
> > of all physical memory" [1].
> >
> > The 6.3 release boots okay and the system is able to operate correctly
> > with an Ubuntu 23.04 rootfs on eMMC. Therefore, I decided to bisect and
> > I found the failure begins with 3335068f8721 ("riscv: Use PUD/P4D/PGD
> > pages for the linear mapping"). The system boots okay with the prior
> > commit 8589e346bbb6 ("riscv: Move the linear mapping creation in its
> > own function").
> >
> > The boot log [2] shows that the fault happens right after buildroot's
> > init script [3] uses switch_root to execute init from the Ubuntu rootfs
> > on the eMMC.
> >
> > DWARF4 is enabled in .config [4] and the decoded stack trace [5] shows:
> >
> > epc : __memset (/eng/dfustini/gitlab/linux/arch/riscv/lib/memset.S:67)
> >
> > From memset.S:
> >
> > Line 67: REG_S a1, 0(t0)
> >
> > From the oops:
> >
> > epc : ffffffff81122d6c ra : ffffffff80218504 sp : ffffaf8002e47500
> > gp : ffffffff82695010 tp : ffffaf8002e2ec00 t0 : ffffaf8000000000
> > t1 : 0000000000000080 t2 : 0000000000000001 s0 : ffffaf8002e47550
> > s1 : ffff8d8200000040 a0 : ffffaf8000000000 a1 : 0000000000000000
> >
> > Thus I think it is trying to store 0x0 to 0xffffaf8000000000 which is
> > the start of the direct map. From the boot log [2], OpenSBI shows:
> >
> > Domain0 Region00 : 0x0000000002080000-0x00000000020bffff M: (I,R,W) S/U: ()
> > Domain0 Region01 : 0x0000008000000000-0x000000800003ffff M: (R,W,X) S/U: ()
> > Domain0 Region02 : 0x0000000002000000-0x000000000207ffff M: (I,R,W) S/U: ()
> > Domain0 Region03 : 0x0000000000000000-0xffffffffffffffff M: (R,W,X) S/U: (R,W,X)
> >
> > The DDR memory on this SoC starts at 0x8000000000 with size 2GB. The
> > memory node from the device tree [6]:
> >
> > memory@8000000000 {
> > device_type = "memory";
> > reg = <0x80 0 0x00000000 0x80000000>;
> > };
> >
> > I think the direct map address 0xffffaf8000000000 would map to physical
> > address 0x8000000000. Thus I think the attempted store in S-mode to that
> > address would violate the PMP settings for Region01.
> >
> > I do not yet understand why this happens with 3335068f8721 ("riscv: Use
> > PUD/P4D/PGD pages for the linear mapping") but not for the prior commit
> > 8589e346bbb6 ("riscv: Move the linear mapping creation in its own
> > function").
>
> Where does Linux's DTB come from? It should be the one that was modified
> by OpenSBI to add a reserved-memory node matching PMP Region01
> (fdt_reserved_memory_fixup()).
>
> Before this commit, Linux ignored the first 2 MiB of physical RAM. So if
> OpenSBI was loaded in this region, you could get away with ignoring the
> firmware-provided DTB; now you actually need to use it, as intended.

The address of the dtb is passed by the boot code to OpenSBI. I had been
using OpenSBI master from Jan 9: 001106d ("docs: Update domain's region
permissions and requirements"). The kernel receives the device tree from
OpenSBI but I had never actually dumped it from sysfs.

I checked out the prior kernel commit 8589e346bbb6 ("riscv: Move the
linear mapping creation in its own function") and ran "dtc -I fs
/sys/firmware/devicetree/base/" to dump the device tree [1]. This showed
that the reserved-memory node was blank.

Jessica pointed out to me on #riscv IRC that this was fixed in OpenSBI
on Jan 21 with: a990309 ("lib: utils: Fix reserved memory node for
firmware memory"). Therefore, I updated to the current OpenSBI master:
33f1722 ("lib: sbi: Document sbi_ecall_extension members") from May 15.
The device tree that OpenSBI passes to the kernel now has
"mmode_resv0@80,0" and "mmode_resv1@80,20000".

Furthermore, my system now boots okay with 3335068f8721 ("riscv: Use
PUD/P4D/PGD pages for the linear mapping") so the problem was just that
I had been using an OpenSBI that was slightly too old.
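
As a quick sanity check on those node names, the unit addresses can be
decoded and compared against PMP Region01 from the original boot log.
This is a sketch that assumes the unit address is two comma-separated
hexadecimal cells (high, low); the node sizes are not shown here, so
only the start addresses are checked. The helper name is hypothetical.

```python
REGION01_START = 0x8000000000   # PMP Region01 start from the OpenSBI boot log
REGION01_END = 0x800003FFFF     # PMP Region01 end, inclusive


def unit_address_to_pa(node_name):
    """Decode 'name@hi,lo' (hex cells) into a 64-bit physical address."""
    unit = node_name.split("@", 1)[1]
    hi, lo = (int(cell, 16) for cell in unit.split(","))
    return (hi << 32) | lo


for node in ("mmode_resv0@80,0", "mmode_resv1@80,20000"):
    pa = unit_address_to_pa(node)
    inside = REGION01_START <= pa <= REGION01_END
    print(node, hex(pa), "starts inside Region01:", inside)
```

Both nodes start inside the range that OpenSBI protects with Region01.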

Thanks,
Drew

[1] https://gist.github.com/pdp7/71ca465997274e11953b26861e36144f
[2] https://gist.github.com/pdp7/90b4632146fc55625735fa288d80532b