2022-05-28 18:48:14

by syzbot

Subject: [syzbot] riscv/fixes test error: lost connection to test machine

Hello,

syzbot found the following issue on:

HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
userspace arch: riscv64

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]



---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


2022-05-28 19:07:59

by Alexandre Ghiti

Subject: Re: [syzbot] riscv/fixes test error: lost connection to test machine

On 5/27/22 19:12, Dmitry Vyukov wrote:
> On Fri, 27 May 2022 at 19:04, Dmitry Vyukov <[email protected]> wrote:
>> On Fri, 27 May 2022 at 16:01, Alexandre Ghiti
>> <[email protected]> wrote:
>>> On Friday, May 27, 2022 at 3:55:24 PM UTC+2 Dmitry Vyukov wrote:
>>>> On Fri, 27 May 2022 at 15:50, Alexandre Ghiti
>>>> <[email protected]> wrote:
>>>>> On Friday, May 27, 2022 at 3:02:01 PM UTC+2 Dmitry Vyukov wrote:
>>>>>> On Fri, 27 May 2022 at 14:55, syzbot
>>>>>> <[email protected]> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> syzbot found the following issue on:
>>>>>>>
>>>>>>> HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
>>>>>>> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
>>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
>>>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
>>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
>>>>>>> compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>>>>> userspace arch: riscv64
>>>>>>>
>>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>>> Reported-by: [email protected]
>>>>>> The CONFIG_KASAN_VMALLOC allows riscv kernel to boot, but now Go
>>>>>> processes started crashing with:
>>>>>>
>>>>>> 1970/01/01 00:06:55 fuzzer started
>>>>>> runtime: lfstack.push invalid packing: node=0xffffff5908a940 cnt=0x1
>>>>>> packed=0xffff5908a9400001 -> node=0xffff5908a940
>>>>>> fatal error: lfstack.push
>>>>>> runtime stack:
>>>>>> runtime.throw({0x30884c, 0xc})
>>>>>> /usr/local/go/src/runtime/panic.go:1198 +0x60
>>>>>> runtime.(*lfstack).push(0xdb3850, 0xffffff5908a940)
>>>>>> /usr/local/go/src/runtime/lfstack.go:30 +0x1a8
>>>>>>
>>>>>> Go runtime tries to shove some data into the upper 16 bits of pointers
>>>>>> assuming they are unused.
>>>>>> However, the original pointer node=0xffffff5908a940 suggest riscv now
>>>>>> has 56-bit users-space address space?
>>>>>
>>>>> Yes, sv57 was merged recently.
>>>>>
>>>>>> Documentation/riscv/vm-layout.rst claims 48-bit pointers:
>>>>>> "
>>>>>> The RISC-V privileged architecture document states that the 64bit addresses
>>>>>> "must have bits 63–48 all equal to bit 47, or else a page-fault exception will
>>>>>> occur.":
>>>>>
>>>>> Thanks for pointing that, I extracted that from the specification before sv57 was specified, I'll fix that.
>>>>>
>>>>> The current kernel code will use sv57 as it is supported and advertised by qemu, and to my knowledge, you can't downgrade to sv48 unless by re-compiling qemu using the following:
>>>>>
>>>>> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
>>>>> index 6dbe9b541f..a64b50ed75 100644
>>>>> --- a/target/riscv/csr.c
>>>>> +++ b/target/riscv/csr.c
>>>>> @@ -637,7 +637,7 @@ static const char valid_vm_1_10_64[16] = {
>>>>> [VM_1_10_MBARE] = 1,
>>>>> [VM_1_10_SV39] = 1,
>>>>> [VM_1_10_SV48] = 1,
>>>>> - [VM_1_10_SV57] = 1
>>>>> + [VM_1_10_SV57] = 0
>>>>> };
>>>>>
>>>>> /* Machine Information Registers */
>>>>>
>>>>>> ...
>>>>>> 0000000000000000 | 0 | 0000003fffffffff | 256 GB |
>>>>>> user-space virtual memory, different per mm
>>>>>> "
>>>> There is no kernel config to force SV48/39, right?
>>>
>>> No, we rely on what the hardware advertises, if it supports sv57, we'll go for sv57, if not, we'll try sv48...etc. I had some patches to force the downgrade by using the device tree but they never got merged though.
>> +original CC list
>>
>> FTR sent Go runtime change to support SV57:
>> https://go-review.googlesource.com/c/go/+/409055
>
>
> Is CONFIG_CMDLINE broken on riscv?
> I am running with:
>
> CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0
> sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb
> nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000
> nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000
> nf-conntrack-sane.ports=20000 binder.debug_mask=0
> rcupdate.rcu_expedited=1 no_hash_pointers page_owner=on
> sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4
> secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1
> msr.allow_writes=off dummy_hcd.num=2 smp.csd_lock_timeout=300000
> watchdog_thresh=165 workqueue.watchdog_thresh=420
> sysctl.net.core.netdev_unregister_timeout_secs=420 panic_on_warn=1"


This command line is 608 characters long, but we are still stuck with the
default COMMAND_LINE_SIZE of 512, so I imagine that is the problem. I
proposed a patch last year to bump it to 1024, but it never got
merged:
https://lore.kernel.org/lkml/CAEn-LTqTXCEC=bXTvGyo8SNL0JMWRKtiSwQB7R=Pc4uhxZUruA@mail.gmail.com/T/#m4b45019dc0f5573f2a50c1f6007c5109fa35efff
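
To make the arithmetic concrete, here is a minimal sketch in Go. It assumes
the limit has to cover the string plus a terminating NUL byte; the helper name
`fits` and the stand-in string are hypothetical, and only the numbers 608, 512
and 1024 (plus 2048 for comparison) come from this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// fits reports whether cmdline plus its terminating NUL byte fits
// into a buffer of limit bytes (assumption: the limit includes the NUL).
func fits(cmdline string, limit int) bool {
	return len(cmdline)+1 <= limit
}

func main() {
	// Stand-in for the real CONFIG_CMDLINE: only its length (608) is
	// taken from the thread, the content is a placeholder.
	cmdline := strings.Repeat("x", 608)

	for _, limit := range []int{512, 1024, 2048} {
		fmt.Printf("limit=%d fits=%v\n", limit, fits(cmdline, limit))
	}
}
```

Under that assumption a 608-byte line does not fit into 512 bytes but fits
comfortably into 1024.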


>
> But getting BUGs with the default timeout:
> watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:4:2039]
>
> _______________________________________________
> linux-riscv mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-riscv

2022-05-28 19:24:21

by Dmitry Vyukov

Subject: Re: [syzbot] riscv/fixes test error: lost connection to test machine

On Fri, 27 May 2022 at 16:01, Alexandre Ghiti
<[email protected]> wrote:
> On Friday, May 27, 2022 at 3:55:24 PM UTC+2 Dmitry Vyukov wrote:
>>
>> On Fri, 27 May 2022 at 15:50, Alexandre Ghiti
>> <[email protected]> wrote:
>> > On Friday, May 27, 2022 at 3:02:01 PM UTC+2 Dmitry Vyukov wrote:
>> >>
>> >> On Fri, 27 May 2022 at 14:55, syzbot
>> >> <[email protected]> wrote:
>> >> >
>> >> > Hello,
>> >> >
>> >> > syzbot found the following issue on:
>> >> >
>> >> > HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
>> >> > git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
>> >> > console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
>> >> > kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
>> >> > dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
>> >> > compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>> >> > userspace arch: riscv64
>> >> >
>> >> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> >> > Reported-by: [email protected]
>> >>
>> >> The CONFIG_KASAN_VMALLOC allows riscv kernel to boot, but now Go
>> >> processes started crashing with:
>> >>
>> >> 1970/01/01 00:06:55 fuzzer started
>> >> runtime: lfstack.push invalid packing: node=0xffffff5908a940 cnt=0x1
>> >> packed=0xffff5908a9400001 -> node=0xffff5908a940
>> >> fatal error: lfstack.push
>> >> runtime stack:
>> >> runtime.throw({0x30884c, 0xc})
>> >> /usr/local/go/src/runtime/panic.go:1198 +0x60
>> >> runtime.(*lfstack).push(0xdb3850, 0xffffff5908a940)
>> >> /usr/local/go/src/runtime/lfstack.go:30 +0x1a8
>> >>
>> >> Go runtime tries to shove some data into the upper 16 bits of pointers
>> >> assuming they are unused.
>> >> However, the original pointer node=0xffffff5908a940 suggest riscv now
>> >> has 56-bit users-space address space?
>> >
>> >
>> > Yes, sv57 was merged recently.
>> >
>> >>
>> >> Documentation/riscv/vm-layout.rst claims 48-bit pointers:
>> >> "
>> >> The RISC-V privileged architecture document states that the 64bit addresses
>> >> "must have bits 63–48 all equal to bit 47, or else a page-fault exception will
>> >> occur.":
>> >
>> >
>> > Thanks for pointing that, I extracted that from the specification before sv57 was specified, I'll fix that.
>> >
>> > The current kernel code will use sv57 as it is supported and advertised by qemu, and to my knowledge, you can't downgrade to sv48 unless by re-compiling qemu using the following:
>> >
>> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
>> > index 6dbe9b541f..a64b50ed75 100644
>> > --- a/target/riscv/csr.c
>> > +++ b/target/riscv/csr.c
>> > @@ -637,7 +637,7 @@ static const char valid_vm_1_10_64[16] = {
>> > [VM_1_10_MBARE] = 1,
>> > [VM_1_10_SV39] = 1,
>> > [VM_1_10_SV48] = 1,
>> > - [VM_1_10_SV57] = 1
>> > + [VM_1_10_SV57] = 0
>> > };
>> >
>> > /* Machine Information Registers */
>> >
>> >> ...
>> >> 0000000000000000 | 0 | 0000003fffffffff | 256 GB |
>> >> user-space virtual memory, different per mm
>> >> "
>>
>> There is no kernel config to force SV48/39, right?
>
>
> No, we rely on what the hardware advertises, if it supports sv57, we'll go for sv57, if not, we'll try sv48...etc. I had some patches to force the downgrade by using the device tree but they never got merged though.

+original CC list

FTR, I sent a Go runtime change to support SV57:
https://go-review.googlesource.com/c/go/+/409055

2022-05-28 19:24:22

by Alexandre Ghiti

Subject: Re: [syzbot] riscv/fixes test error: lost connection to test machine

On 5/27/22 19:04, Dmitry Vyukov wrote:
> On Fri, 27 May 2022 at 16:01, Alexandre Ghiti
> <[email protected]> wrote:
>> On Friday, May 27, 2022 at 3:55:24 PM UTC+2 Dmitry Vyukov wrote:
>>> On Fri, 27 May 2022 at 15:50, Alexandre Ghiti
>>> <[email protected]> wrote:
>>>> On Friday, May 27, 2022 at 3:02:01 PM UTC+2 Dmitry Vyukov wrote:
>>>>> On Fri, 27 May 2022 at 14:55, syzbot
>>>>> <[email protected]> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> syzbot found the following issue on:
>>>>>>
>>>>>> HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
>>>>>> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
>>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
>>>>>> compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>>>> userspace arch: riscv64
>>>>>>
>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>> Reported-by: [email protected]
>>>>> The CONFIG_KASAN_VMALLOC allows riscv kernel to boot, but now Go
>>>>> processes started crashing with:
>>>>>
>>>>> 1970/01/01 00:06:55 fuzzer started
>>>>> runtime: lfstack.push invalid packing: node=0xffffff5908a940 cnt=0x1
>>>>> packed=0xffff5908a9400001 -> node=0xffff5908a940
>>>>> fatal error: lfstack.push
>>>>> runtime stack:
>>>>> runtime.throw({0x30884c, 0xc})
>>>>> /usr/local/go/src/runtime/panic.go:1198 +0x60
>>>>> runtime.(*lfstack).push(0xdb3850, 0xffffff5908a940)
>>>>> /usr/local/go/src/runtime/lfstack.go:30 +0x1a8
>>>>>
>>>>> Go runtime tries to shove some data into the upper 16 bits of pointers
>>>>> assuming they are unused.
>>>>> However, the original pointer node=0xffffff5908a940 suggest riscv now
>>>>> has 56-bit users-space address space?
>>>>
>>>> Yes, sv57 was merged recently.
>>>>
>>>>> Documentation/riscv/vm-layout.rst claims 48-bit pointers:
>>>>> "
>>>>> The RISC-V privileged architecture document states that the 64bit addresses
>>>>> "must have bits 63–48 all equal to bit 47, or else a page-fault exception will
>>>>> occur.":
>>>>
>>>> Thanks for pointing that, I extracted that from the specification before sv57 was specified, I'll fix that.
>>>>
>>>> The current kernel code will use sv57 as it is supported and advertised by qemu, and to my knowledge, you can't downgrade to sv48 unless by re-compiling qemu using the following:
>>>>
>>>> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
>>>> index 6dbe9b541f..a64b50ed75 100644
>>>> --- a/target/riscv/csr.c
>>>> +++ b/target/riscv/csr.c
>>>> @@ -637,7 +637,7 @@ static const char valid_vm_1_10_64[16] = {
>>>> [VM_1_10_MBARE] = 1,
>>>> [VM_1_10_SV39] = 1,
>>>> [VM_1_10_SV48] = 1,
>>>> - [VM_1_10_SV57] = 1
>>>> + [VM_1_10_SV57] = 0
>>>> };
>>>>
>>>> /* Machine Information Registers */
>>>>
>>>>> ...
>>>>> 0000000000000000 | 0 | 0000003fffffffff | 256 GB |
>>>>> user-space virtual memory, different per mm
>>>>> "
>>> There is no kernel config to force SV48/39, right?
>>
>> No, we rely on what the hardware advertises, if it supports sv57, we'll go for sv57, if not, we'll try sv48...etc. I had some patches to force the downgrade by using the device tree but they never got merged though.
> +original CC list
>
> FTR sent Go runtime change to support SV57:
> https://go-review.googlesource.com/c/go/+/409055


Thank you for that, I'll pull that into Ubuntu when merged. Do you know
if any other programming language does the same and would need a fix too?



2022-05-28 19:28:21

by Dmitry Vyukov

Subject: Re: [syzbot] riscv/fixes test error: lost connection to test machine

On Sat, 28 May 2022 at 10:03, Alexandre Ghiti <[email protected]> wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> syzbot found the following issue on:
> >>>>>>
> >>>>>> HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
> >>>>>> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> >>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
> >>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
> >>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
> >>>>>> compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> >>>>>> userspace arch: riscv64
> >>>>>>
> >>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>>>>> Reported-by: [email protected]
> >>>>> The CONFIG_KASAN_VMALLOC allows riscv kernel to boot, but now Go
> >>>>> processes started crashing with:
> >>>>>
> >>>>> 1970/01/01 00:06:55 fuzzer started
> >>>>> runtime: lfstack.push invalid packing: node=0xffffff5908a940 cnt=0x1
> >>>>> packed=0xffff5908a9400001 -> node=0xffff5908a940
> >>>>> fatal error: lfstack.push
> >>>>> runtime stack:
> >>>>> runtime.throw({0x30884c, 0xc})
> >>>>> /usr/local/go/src/runtime/panic.go:1198 +0x60
> >>>>> runtime.(*lfstack).push(0xdb3850, 0xffffff5908a940)
> >>>>> /usr/local/go/src/runtime/lfstack.go:30 +0x1a8
> >>>>>
> >>>>> Go runtime tries to shove some data into the upper 16 bits of pointers
> >>>>> assuming they are unused.
> >>>>> However, the original pointer node=0xffffff5908a940 suggest riscv now
> >>>>> has 56-bit users-space address space?
> >>>>
> >>>> Yes, sv57 was merged recently.
> >>>>
> >>>>> Documentation/riscv/vm-layout.rst claims 48-bit pointers:
> >>>>> "
> >>>>> The RISC-V privileged architecture document states that the 64bit addresses
> >>>>> "must have bits 63–48 all equal to bit 47, or else a page-fault exception will
> >>>>> occur.":
> >>>>
> >>>> Thanks for pointing that, I extracted that from the specification before sv57 was specified, I'll fix that.
> >>>>
> >>>> The current kernel code will use sv57 as it is supported and advertised by qemu, and to my knowledge, you can't downgrade to sv48 unless by re-compiling qemu using the following:
> >>>>
> >>>> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> >>>> index 6dbe9b541f..a64b50ed75 100644
> >>>> --- a/target/riscv/csr.c
> >>>> +++ b/target/riscv/csr.c
> >>>> @@ -637,7 +637,7 @@ static const char valid_vm_1_10_64[16] = {
> >>>> [VM_1_10_MBARE] = 1,
> >>>> [VM_1_10_SV39] = 1,
> >>>> [VM_1_10_SV48] = 1,
> >>>> - [VM_1_10_SV57] = 1
> >>>> + [VM_1_10_SV57] = 0
> >>>> };
> >>>>
> >>>> /* Machine Information Registers */
> >>>>
> >>>>> ...
> >>>>> 0000000000000000 | 0 | 0000003fffffffff | 256 GB |
> >>>>> user-space virtual memory, different per mm
> >>>>> "
> >>> There is no kernel config to force SV48/39, right?
> >>
> >> No, we rely on what the hardware advertises, if it supports sv57, we'll go for sv57, if not, we'll try sv48...etc. I had some patches to force the downgrade by using the device tree but they never got merged though.
> > +original CC list
> >
> > FTR sent Go runtime change to support SV57:
> > https://go-review.googlesource.com/c/go/+/409055
>
>
> Thank you for that, I'll pull that into Ubuntu when merged. Do you know
> if any other programming language does the same and would need a fix too?

Nothing comes to mind right now.
But this is not only about language runtimes, it's about all software
out there. However, x86 has 5-level paging now as well, so it should have
hit these problems earlier... but somehow that did not happen for the Go
runtime.

2022-05-28 20:09:25

by Dmitry Vyukov

Subject: Re: [syzbot] riscv/fixes test error: lost connection to test machine

On Sat, 28 May 2022 at 10:09, Alexandre Ghiti <[email protected]> wrote:
>
> On 5/27/22 19:12, Dmitry Vyukov wrote:
> > On Fri, 27 May 2022 at 19:04, Dmitry Vyukov <[email protected]> wrote:
> >> On Fri, 27 May 2022 at 16:01, Alexandre Ghiti
> >> <[email protected]> wrote:
> >>> On Friday, May 27, 2022 at 3:55:24 PM UTC+2 Dmitry Vyukov wrote:
> >>>> On Fri, 27 May 2022 at 15:50, Alexandre Ghiti
> >>>> <[email protected]> wrote:
> >>>>> On Friday, May 27, 2022 at 3:02:01 PM UTC+2 Dmitry Vyukov wrote:
> >>>>>> On Fri, 27 May 2022 at 14:55, syzbot
> >>>>>> <[email protected]> wrote:
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> syzbot found the following issue on:
> >>>>>>>
> >>>>>>> HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
> >>>>>>> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> >>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
> >>>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
> >>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
> >>>>>>> compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> >>>>>>> userspace arch: riscv64
> >>>>>>>
> >>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >>>>>>> Reported-by: [email protected]
> >>>>>> The CONFIG_KASAN_VMALLOC allows riscv kernel to boot, but now Go
> >>>>>> processes started crashing with:
> >>>>>>
> >>>>>> 1970/01/01 00:06:55 fuzzer started
> >>>>>> runtime: lfstack.push invalid packing: node=0xffffff5908a940 cnt=0x1
> >>>>>> packed=0xffff5908a9400001 -> node=0xffff5908a940
> >>>>>> fatal error: lfstack.push
> >>>>>> runtime stack:
> >>>>>> runtime.throw({0x30884c, 0xc})
> >>>>>> /usr/local/go/src/runtime/panic.go:1198 +0x60
> >>>>>> runtime.(*lfstack).push(0xdb3850, 0xffffff5908a940)
> >>>>>> /usr/local/go/src/runtime/lfstack.go:30 +0x1a8
> >>>>>>
> >>>>>> Go runtime tries to shove some data into the upper 16 bits of pointers
> >>>>>> assuming they are unused.
> >>>>>> However, the original pointer node=0xffffff5908a940 suggest riscv now
> >>>>>> has 56-bit users-space address space?
> >>>>>
> >>>>> Yes, sv57 was merged recently.
> >>>>>
> >>>>>> Documentation/riscv/vm-layout.rst claims 48-bit pointers:
> >>>>>> "
> >>>>>> The RISC-V privileged architecture document states that the 64bit addresses
> >>>>>> "must have bits 63–48 all equal to bit 47, or else a page-fault exception will
> >>>>>> occur.":
> >>>>>
> >>>>> Thanks for pointing that, I extracted that from the specification before sv57 was specified, I'll fix that.
> >>>>>
> >>>>> The current kernel code will use sv57 as it is supported and advertised by qemu, and to my knowledge, you can't downgrade to sv48 unless by re-compiling qemu using the following:
> >>>>>
> >>>>> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> >>>>> index 6dbe9b541f..a64b50ed75 100644
> >>>>> --- a/target/riscv/csr.c
> >>>>> +++ b/target/riscv/csr.c
> >>>>> @@ -637,7 +637,7 @@ static const char valid_vm_1_10_64[16] = {
> >>>>> [VM_1_10_MBARE] = 1,
> >>>>> [VM_1_10_SV39] = 1,
> >>>>> [VM_1_10_SV48] = 1,
> >>>>> - [VM_1_10_SV57] = 1
> >>>>> + [VM_1_10_SV57] = 0
> >>>>> };
> >>>>>
> >>>>> /* Machine Information Registers */
> >>>>>
> >>>>>> ...
> >>>>>> 0000000000000000 | 0 | 0000003fffffffff | 256 GB |
> >>>>>> user-space virtual memory, different per mm
> >>>>>> "
> >>>> There is no kernel config to force SV48/39, right?
> >>>
> >>> No, we rely on what the hardware advertises, if it supports sv57, we'll go for sv57, if not, we'll try sv48...etc. I had some patches to force the downgrade by using the device tree but they never got merged though.
> >> +original CC list
> >>
> >> FTR sent Go runtime change to support SV57:
> >> https://go-review.googlesource.com/c/go/+/409055
> >
> >
> > Is CONFIG_CMDLINE broken on riscv?
> > I am running with:
> >
> > CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0
> > sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb
> > nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000
> > nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000
> > nf-conntrack-sane.ports=20000 binder.debug_mask=0
> > rcupdate.rcu_expedited=1 no_hash_pointers page_owner=on
> > sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4
> > secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1
> > msr.allow_writes=off dummy_hcd.num=2 smp.csd_lock_timeout=300000
> > watchdog_thresh=165 workqueue.watchdog_thresh=420
> > sysctl.net.core.netdev_unregister_timeout_secs=420 panic_on_warn=1"
>
>
> This command line is 608-character long, but we are still stuck with the
> default COMMAND_LINE_SIZE to 512, I imagine that it is the problem. I
> had proposed a patch last year to bump that to 1024, but it never got
> merged
> https://lore.kernel.org/lkml/CAEn-LTqTXCEC=bXTvGyo8SNL0JMWRKtiSwQB7R=Pc4uhxZUruA@mail.gmail.com/T/#m4b45019dc0f5573f2a50c1f6007c5109fa35efff


RISC-V maintainers, please merge it now.
I would even suggest 2048:

git grep "define COMMAND_LINE_SIZE" arch/
arch/alpha/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256
arch/arc/include/asm/setup.h:#define COMMAND_LINE_SIZE 256
arch/arm/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 1024
arch/arm64/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 2048
arch/ia64/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 2048
arch/m68k/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256
arch/microblaze/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256
arch/mips/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 4096
arch/parisc/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 1024
arch/powerpc/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 2048
arch/s390/include/asm/setup.h:#define COMMAND_LINE_SIZE CONFIG_COMMAND_LINE_SIZE
arch/sparc/include/uapi/asm/setup.h:# define COMMAND_LINE_SIZE 2048
arch/sparc/include/uapi/asm/setup.h:# define COMMAND_LINE_SIZE 256
arch/um/include/asm/setup.h:#define COMMAND_LINE_SIZE 4096
arch/x86/include/asm/setup.h:#define COMMAND_LINE_SIZE 2048
arch/xtensa/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256


It's also interesting how the kernel handles overflow. Imagine one
adds that_critical_security_feature=1 to the end of an existing long
line.
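
A hedged illustration of that concern, in Go: the sketch below assumes the
overflow is handled by silently keeping only the first size-1 bytes, in the
style of a bounded string copy. How the kernel actually behaves is not
established in this thread, and `truncate`, `hasParam` and the tiny limit of
32 are hypothetical names and values chosen to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

// truncate models a bounded copy into a buffer of size bytes:
// at most size-1 bytes survive, leaving room for a terminating NUL.
func truncate(cmdline string, size int) string {
	if size <= 0 {
		return ""
	}
	if len(cmdline) < size {
		return cmdline
	}
	return cmdline[:size-1]
}

// hasParam reports whether param appears as a whole
// whitespace-separated token in cmdline.
func hasParam(cmdline, param string) bool {
	for _, field := range strings.Fields(cmdline) {
		if field == param {
			return true
		}
	}
	return false
}

func main() {
	const limit = 32 // deliberately tiny, not a real COMMAND_LINE_SIZE
	const critical = "that_critical_security_feature=1"
	cmdline := "earlyprintk=serial panic_on_warn=1 " + critical

	kept := truncate(cmdline, limit)
	fmt.Printf("kept: %q\n", kept)
	fmt.Printf("critical option present before: %v, after: %v\n",
		hasParam(cmdline, critical), hasParam(kept, critical))
}
```

If truncation really is silent, an option appended past the limit would simply
be ignored, and a cut in the middle of a token could even leave a different,
shorter parameter behind.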

2022-05-28 20:09:59

by Dmitry Vyukov

Subject: Re: [syzbot] riscv/fixes test error: lost connection to test machine

On Fri, 27 May 2022 at 19:04, Dmitry Vyukov <[email protected]> wrote:
>
> On Fri, 27 May 2022 at 16:01, Alexandre Ghiti
> <[email protected]> wrote:
> > On Friday, May 27, 2022 at 3:55:24 PM UTC+2 Dmitry Vyukov wrote:
> >>
> >> On Fri, 27 May 2022 at 15:50, Alexandre Ghiti
> >> <[email protected]> wrote:
> >> > On Friday, May 27, 2022 at 3:02:01 PM UTC+2 Dmitry Vyukov wrote:
> >> >>
> >> >> On Fri, 27 May 2022 at 14:55, syzbot
> >> >> <[email protected]> wrote:
> >> >> >
> >> >> > Hello,
> >> >> >
> >> >> > syzbot found the following issue on:
> >> >> >
> >> >> > HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
> >> >> > git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> >> >> > console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
> >> >> > kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
> >> >> > dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
> >> >> > compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> >> >> > userspace arch: riscv64
> >> >> >
> >> >> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> >> >> > Reported-by: [email protected]
> >> >>
> >> >> The CONFIG_KASAN_VMALLOC allows riscv kernel to boot, but now Go
> >> >> processes started crashing with:
> >> >>
> >> >> 1970/01/01 00:06:55 fuzzer started
> >> >> runtime: lfstack.push invalid packing: node=0xffffff5908a940 cnt=0x1
> >> >> packed=0xffff5908a9400001 -> node=0xffff5908a940
> >> >> fatal error: lfstack.push
> >> >> runtime stack:
> >> >> runtime.throw({0x30884c, 0xc})
> >> >> /usr/local/go/src/runtime/panic.go:1198 +0x60
> >> >> runtime.(*lfstack).push(0xdb3850, 0xffffff5908a940)
> >> >> /usr/local/go/src/runtime/lfstack.go:30 +0x1a8
> >> >>
> >> >> Go runtime tries to shove some data into the upper 16 bits of pointers
> >> >> assuming they are unused.
> >> >> However, the original pointer node=0xffffff5908a940 suggest riscv now
> >> >> has 56-bit users-space address space?
> >> >
> >> >
> >> > Yes, sv57 was merged recently.
> >> >
> >> >>
> >> >> Documentation/riscv/vm-layout.rst claims 48-bit pointers:
> >> >> "
> >> >> The RISC-V privileged architecture document states that the 64bit addresses
> >> >> "must have bits 63–48 all equal to bit 47, or else a page-fault exception will
> >> >> occur.":
> >> >
> >> >
> >> > Thanks for pointing that, I extracted that from the specification before sv57 was specified, I'll fix that.
> >> >
> >> > The current kernel code will use sv57 as it is supported and advertised by qemu, and to my knowledge, you can't downgrade to sv48 unless by re-compiling qemu using the following:
> >> >
> >> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> >> > index 6dbe9b541f..a64b50ed75 100644
> >> > --- a/target/riscv/csr.c
> >> > +++ b/target/riscv/csr.c
> >> > @@ -637,7 +637,7 @@ static const char valid_vm_1_10_64[16] = {
> >> > [VM_1_10_MBARE] = 1,
> >> > [VM_1_10_SV39] = 1,
> >> > [VM_1_10_SV48] = 1,
> >> > - [VM_1_10_SV57] = 1
> >> > + [VM_1_10_SV57] = 0
> >> > };
> >> >
> >> > /* Machine Information Registers */
> >> >
> >> >> ...
> >> >> 0000000000000000 | 0 | 0000003fffffffff | 256 GB |
> >> >> user-space virtual memory, different per mm
> >> >> "
> >>
> >> There is no kernel config to force SV48/39, right?
> >
> >
> > No, we rely on what the hardware advertises, if it supports sv57, we'll go for sv57, if not, we'll try sv48...etc. I had some patches to force the downgrade by using the device tree but they never got merged though.
>
> +original CC list
>
> FTR sent Go runtime change to support SV57:
> https://go-review.googlesource.com/c/go/+/409055



Is CONFIG_CMDLINE broken on riscv?
I am running with:

CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0
sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb
nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000
nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000
nf-conntrack-sane.ports=20000 binder.debug_mask=0
rcupdate.rcu_expedited=1 no_hash_pointers page_owner=on
sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4
secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1
msr.allow_writes=off dummy_hcd.num=2 smp.csd_lock_timeout=300000
watchdog_thresh=165 workqueue.watchdog_thresh=420
sysctl.net.core.netdev_unregister_timeout_secs=420 panic_on_warn=1"

But I am getting BUGs with the default timeout:
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:4:2039]

2022-05-28 20:12:37

by Dmitry Vyukov

Subject: Re: [syzbot] riscv/fixes test error: lost connection to test machine

On Fri, 27 May 2022 at 14:55, syzbot
<[email protected]> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
> dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
> compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> userspace arch: riscv64
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: [email protected]

Enabling CONFIG_KASAN_VMALLOC allows the riscv kernel to boot, but now Go
processes have started crashing with:

1970/01/01 00:06:55 fuzzer started
runtime: lfstack.push invalid packing: node=0xffffff5908a940 cnt=0x1
packed=0xffff5908a9400001 -> node=0xffff5908a940
fatal error: lfstack.push
runtime stack:
runtime.throw({0x30884c, 0xc})
/usr/local/go/src/runtime/panic.go:1198 +0x60
runtime.(*lfstack).push(0xdb3850, 0xffffff5908a940)
/usr/local/go/src/runtime/lfstack.go:30 +0x1a8

The Go runtime tries to store some data in the upper 16 bits of pointers,
assuming they are unused.
However, the original pointer node=0xffffff5908a940 suggests riscv now
has a 56-bit user-space address space?
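
A small Go model of the packing described above may help. It is only a
sketch: the constants `addrBits` and `cntBits` and the helper names `pack`
and `unpack` are assumptions chosen so that the arithmetic matches the values
in the crash log; this is not a copy of the runtime source, and it works on
plain integers rather than real pointers.

```go
package main

import "fmt"

const (
	// Assumed width of a user-space address (the 48-bit assumption).
	addrBits = 48
	// Bits left for the counter; the low 3 address bits are assumed
	// to be zero because nodes are 8-byte aligned.
	cntBits = 64 - addrBits + 3
)

// pack squeezes an address and a counter into one 64-bit word.
// Address bits above addrBits are shifted out and lost.
func pack(node, cnt uint64) uint64 {
	return node<<(64-addrBits) | cnt&(1<<cntBits-1)
}

// unpack recovers the address part of a packed word.
func unpack(val uint64) uint64 {
	return val >> cntBits << 3
}

func main() {
	nodes := []uint64{
		0x3f5908a940,     // fits in 48 bits (hypothetical address)
		0xffffff5908a940, // 56-bit address from the crash log
	}
	for _, node := range nodes {
		packed := pack(node, 1)
		back := unpack(packed)
		fmt.Printf("node=%#x packed=%#x -> node=%#x ok=%v\n",
			node, packed, back, back == node)
	}
}
```

With the 48-bit assumption, the top 8 bits of a 56-bit address are shifted
out during packing, which is consistent with node=0xffffff5908a940 coming
back as node=0xffff5908a940 in the log above.
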
Documentation/riscv/vm-layout.rst claims 48-bit pointers:
"
The RISC-V privileged architecture document states that the 64bit addresses
"must have bits 63–48 all equal to bit 47, or else a page-fault exception will
occur.":
...
0000000000000000 | 0 | 0000003fffffffff | 256 GB |
user-space virtual memory, different per mm
"

2022-08-04 06:26:36

by Dmitry Vyukov

Subject: Re: [syzbot] riscv/fixes test error: lost connection to test machine

On Tue, 31 May 2022 at 16:10, Alexandre Ghiti
<[email protected]> wrote:
> On Sat, May 28, 2022 at 10:31 AM Dmitry Vyukov <[email protected]> wrote:
>>
>> On Sat, 28 May 2022 at 10:09, Alexandre Ghiti <[email protected]> wrote:
>> >
>> > On 5/27/22 19:12, Dmitry Vyukov wrote:
>> > > On Fri, 27 May 2022 at 19:04, Dmitry Vyukov <[email protected]> wrote:
>> > >> On Fri, 27 May 2022 at 16:01, Alexandre Ghiti
>> > >> <[email protected]> wrote:
>> > >>> On Friday, May 27, 2022 at 3:55:24 PM UTC+2 Dmitry Vyukov wrote:
>> > >>>> On Fri, 27 May 2022 at 15:50, Alexandre Ghiti
>> > >>>> <[email protected]> wrote:
>> > >>>>> On Friday, May 27, 2022 at 3:02:01 PM UTC+2 Dmitry Vyukov wrote:
>> > >>>>>> On Fri, 27 May 2022 at 14:55, syzbot
>> > >>>>>> <[email protected]> wrote:
>> > >>>>>>> Hello,
>> > >>>>>>>
>> > >>>>>>> syzbot found the following issue on:
>> > >>>>>>>
>> > >>>>>>> HEAD commit: c932edeaf6d6 riscv: dts: microchip: fix gpio1 reg property..
>> > >>>>>>> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
>> > >>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1418add5f00000
>> > >>>>>>> kernel config: https://syzkaller.appspot.com/x/.config?x=aa6b5702bdf14a17
>> > >>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=2c5da6a0a16a0c4f34aa
>> > >>>>>>> compiler: riscv64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>> > >>>>>>> userspace arch: riscv64
>> > >>>>>>>
>> > >>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> > >>>>>>> Reported-by: [email protected]
>> > >>>>>> The CONFIG_KASAN_VMALLOC allows riscv kernel to boot, but now Go
>> > >>>>>> processes started crashing with:
>> > >>>>>>
>> > >>>>>> 1970/01/01 00:06:55 fuzzer started
>> > >>>>>> runtime: lfstack.push invalid packing: node=0xffffff5908a940 cnt=0x1
>> > >>>>>> packed=0xffff5908a9400001 -> node=0xffff5908a940
>> > >>>>>> fatal error: lfstack.push
>> > >>>>>> runtime stack:
>> > >>>>>> runtime.throw({0x30884c, 0xc})
>> > >>>>>> /usr/local/go/src/runtime/panic.go:1198 +0x60
>> > >>>>>> runtime.(*lfstack).push(0xdb3850, 0xffffff5908a940)
>> > >>>>>> /usr/local/go/src/runtime/lfstack.go:30 +0x1a8
>> > >>>>>>
>> > >>>>>> Go runtime tries to shove some data into the upper 16 bits of pointers
>> > >>>>>> assuming they are unused.
>> > >>>>>> However, the original pointer node=0xffffff5908a940 suggests riscv now
>> > >>>>>> has a 56-bit user-space address space?
>> > >>>>>
>> > >>>>> Yes, sv57 was merged recently.
>> > >>>>>
>> > >>>>>> Documentation/riscv/vm-layout.rst claims 48-bit pointers:
>> > >>>>>> "
>> > >>>>>> The RISC-V privileged architecture document states that the 64bit addresses
>> > >>>>>> "must have bits 63–48 all equal to bit 47, or else a page-fault exception will
>> > >>>>>> occur.":
>> > >>>>>
>> > >>>>> Thanks for pointing that out. I extracted that from the specification before sv57 was specified; I'll fix that.
>> > >>>>>
>> > >>>>> The current kernel code will use sv57 as it is supported and advertised by qemu, and to my knowledge, you can't downgrade to sv48 unless by re-compiling qemu using the following:
>> > >>>>>
>> > >>>>> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
>> > >>>>> index 6dbe9b541f..a64b50ed75 100644
>> > >>>>> --- a/target/riscv/csr.c
>> > >>>>> +++ b/target/riscv/csr.c
>> > >>>>> @@ -637,7 +637,7 @@ static const char valid_vm_1_10_64[16] = {
>> > >>>>> [VM_1_10_MBARE] = 1,
>> > >>>>> [VM_1_10_SV39] = 1,
>> > >>>>> [VM_1_10_SV48] = 1,
>> > >>>>> - [VM_1_10_SV57] = 1
>> > >>>>> + [VM_1_10_SV57] = 0
>> > >>>>> };
>> > >>>>>
>> > >>>>> /* Machine Information Registers */
>> > >>>>>
>> > >>>>>> ...
>> > >>>>>> 0000000000000000 | 0 | 0000003fffffffff | 256 GB |
>> > >>>>>> user-space virtual memory, different per mm
>> > >>>>>> "
>> > >>>> There is no kernel config to force SV48/39, right?
>> > >>>
>> > >>> No, we rely on what the hardware advertises, if it supports sv57, we'll go for sv57, if not, we'll try sv48...etc. I had some patches to force the downgrade by using the device tree but they never got merged though.
>> > >> +original CC list
>> > >>
>> > >> FTR sent Go runtime change to support SV57:
>> > >> https://go-review.googlesource.com/c/go/+/409055
>> > >
>> > >
>> > > Is CONFIG_CMDLINE broken on riscv?
>> > > I am running with:
>> > >
>> > > CONFIG_CMDLINE="earlyprintk=serial net.ifnames=0
>> > > sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb
>> > > nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000
>> > > nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000
>> > > nf-conntrack-sane.ports=20000 binder.debug_mask=0
>> > > rcupdate.rcu_expedited=1 no_hash_pointers page_owner=on
>> > > sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4
>> > > secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1
>> > > msr.allow_writes=off dummy_hcd.num=2 smp.csd_lock_timeout=300000
>> > > watchdog_thresh=165 workqueue.watchdog_thresh=420
>> > > sysctl.net.core.netdev_unregister_timeout_secs=420 panic_on_warn=1"
>> >
>> >
>> > This command line is 608 characters long, but we are still stuck with the
>> > default COMMAND_LINE_SIZE of 512, which I imagine is the problem. I
>> > had proposed a patch last year to bump that to 1024, but it never got
>> > merged:
>> > https://lore.kernel.org/lkml/CAEn-LTqTXCEC=bXTvGyo8SNL0JMWRKtiSwQB7R=Pc4uhxZUruA@mail.gmail.com/T/#m4b45019dc0f5573f2a50c1f6007c5109fa35efff
>>
>>
>> risc-v maintainers, please merge it now.
>> I would even suggest 2048:
>>
>> git grep "define COMMAND_LINE_SIZE" arch/
>> arch/alpha/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256
>> arch/arc/include/asm/setup.h:#define COMMAND_LINE_SIZE 256
>> arch/arm/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 1024
>> arch/arm64/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 2048
>> arch/ia64/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 2048
>> arch/m68k/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256
>> arch/microblaze/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256
>> arch/mips/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 4096
>> arch/parisc/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 1024
>> arch/powerpc/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 2048
>> arch/s390/include/asm/setup.h:#define COMMAND_LINE_SIZE CONFIG_COMMAND_LINE_SIZE
>> arch/sparc/include/uapi/asm/setup.h:# define COMMAND_LINE_SIZE 2048
>> arch/sparc/include/uapi/asm/setup.h:# define COMMAND_LINE_SIZE 256
>> arch/um/include/asm/setup.h:#define COMMAND_LINE_SIZE 4096
>> arch/x86/include/asm/setup.h:#define COMMAND_LINE_SIZE 2048
>> arch/xtensa/include/uapi/asm/setup.h:#define COMMAND_LINE_SIZE 256
>>
>>
>> It's also interesting how the kernel handles overflow. Imagine one
>> adds that_critical_security_feature=1 to the end of an existing long
>> line.
>
>
> Your comment rang a bell and I searched in my old patchsets: I had submitted a patch [1] to output a warning in case of an overflow and to correctly truncate the command line to avoid such issues: it was taken with another series [2] which was actually never merged...My bad on this one, I followed my patch in the series but not the series itself.
>
> I'll try to re-submit it because I agree the current behaviour is really wrong.
>
> [1] https://lore.kernel.org/lkml/[email protected]/T/
> [2] https://lore.kernel.org/linux-devicetree/c3d52a6e1423d9d27c59ad7ab945929b09f74866.1617375802.git.christophe.leroy@csgroup.eu/T/


FTR I've merged the Go fix for SV57:
https://go-review.googlesource.com/c/go/+/409055
but it will only appear in Go 1.20.

And we still have the command line length issue for reviving syzbot testing.
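
To see which parameters are at risk with a long CONFIG_CMDLINE, something like the following Python sketch can help. It assumes the simplest possible overflow behaviour, plain truncation to COMMAND_LINE_SIZE - 1 bytes with one byte kept for the terminating NUL; the thread does not confirm that this is what the riscv kernel actually does, and split_at_limit is a hypothetical helper, not a kernel or syzkaller function.

```python
COMMAND_LINE_SIZE = 512  # riscv default mentioned in the thread


def split_at_limit(cmdline, size=COMMAND_LINE_SIZE):
    """Return (kept, lost) parameter lists, assuming plain truncation."""
    raw = cmdline.encode()
    all_params = cmdline.split()
    # Assumption: one byte is reserved for the NUL terminator.
    kept = raw[: size - 1].decode(errors="ignore").split()
    # A parameter cut in the middle is counted as lost, not kept.
    if len(raw) > size - 1 and kept and kept[-1] != all_params[len(kept) - 1]:
        kept.pop()
    return kept, all_params[len(kept):]


# Small limit so the effect is visible with a short example line.
line = "earlyprintk=serial panic_on_warn=1 that_critical_security_feature=1"
kept, lost = split_at_limit(line, size=40)
print("kept:", kept)
print("lost:", lost)
```

Running it on the real CONFIG_CMDLINE with the default size lists the trailing options that would be dropped under this assumption, which is the scenario described above for a security-relevant option appended to an already long line.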