Subject: Re: riscv+KASAN does not boot
From: Alex Ghiti
To: Dmitry Vyukov
Cc: Albert Ou, Bjorn Topel, Palmer Dabbelt, LKML, nylon7@andestech.com, syzkaller, Andreas Schwab, Paul Walmsley, Tobias Klauser, linux-riscv
Date: Thu, 18 Feb 2021 02:54:06 -0500
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Dmitry,

> On Wed, Feb 17, 2021 at 5:36 PM Alex Ghiti wrote:
>>
>> On 2/16/21 at 11:42 PM, Dmitry Vyukov wrote:
>>> On Tue, Feb 16, 2021 at 9:42 PM Alex Ghiti wrote:
>>>>
>>>> Hi Dmitry,
>>>>
>>>> On 2/16/21 at 6:25 AM, Dmitry Vyukov wrote:
>>>>> On Tue, Feb 16, 2021 at 12:17 PM Dmitry Vyukov wrote:
>>>>>>
>>>>>> On Fri, Jan 29, 2021 at 9:11 AM Dmitry Vyukov wrote:
>>>>>>>> I was fixing KASAN support for my sv48 patchset so I took a look at your
>>>>>>>> issue: I built a kernel on top of the branch riscv/fixes using
>>>>>>>> https://github.com/google/syzkaller/blob/269d24e857a757d09a898086a2fa6fa5d827c3e1/dashboard/config/linux/upstream-riscv64-kasan.config
>>>>>>>> and Buildroot 2020.11. I have the warnings regarding the use of
>>>>>>>> __virt_to_phys on wrong addresses (but that's normal since this function
>>>>>>>> is used in virt_addr_valid) but not the segfaults you describe.
>>>>>>>
>>>>>>> Hi Alex,
>>>>>>>
>>>>>>> Let me try to rebuild the buildroot image. Maybe there was something wrong
>>>>>>> with my build, though I did 'make clean' before doing so. But at the
>>>>>>> same time it worked back in June...
>>>>>>>
>>>>>>> Re WARNINGs, they indicate kernel bugs. I am working on setting up a
>>>>>>> syzbot instance on riscv. If there is a WARNING during boot then the
>>>>>>> kernel will be marked as broken. No further testing will happen.
>>>>>>> Is it a misuse of WARN_ON? If so, could anybody please remove it or
>>>>>>> replace it with pr_err?
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I've localized one issue with riscv/KASAN:
>>>>>> KASAN breaks VDSO and that is, I think, the root cause of the weird faults I
>>>>>> saw earlier. The following patch fixes it.
>>>>>> Could somebody please upstream this fix? I don't know how to add/run
>>>>>> tests for this.
>>>>>> Thanks
>>>>>>
>>>>>> diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile
>>>>>> index 0cfd6da784f84..cf3a383c1799d 100644
>>>>>> --- a/arch/riscv/kernel/vdso/Makefile
>>>>>> +++ b/arch/riscv/kernel/vdso/Makefile
>>>>>> @@ -35,6 +35,7 @@ CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os
>>>>>>  # Disable gcov profiling for VDSO code
>>>>>>  GCOV_PROFILE := n
>>>>>>  KCOV_INSTRUMENT := n
>>>>>> +KASAN_SANITIZE := n
>>>>>>
>>>>>>  # Force dependency
>>>>>>  $(obj)/vdso.o: $(obj)/vdso.so
>>>>
>>>> What's weird is that I don't have any issue without this patch with the
>>>> following config whereas it indeed seems required for KASAN. But when
>>>> looking at the segfaults you got earlier, the segfault address is 0xbb0
>>>> and the cause is an instruction page fault: this address is the PLT base
>>>> address in vdso.so and an instruction page fault would mean that someone
>>>> tried to jump to this address, which is weird. At first sight, that does
>>>> not seem related to your patch above, but clearly I may be wrong.
>>>>
>>>> Tobias, did you observe the same segfaults as Dmitry?
>>>
>>>
>>> I noticed that not all buildroot images use VDSO; it seems to be
>>> dependent on libc settings (at least I think I changed it in the
>>> past).
>>
>> Ok, I used uClibc but then when using glibc, I have the same segfaults,
>> only when KASAN is enabled. And your patch fixes the problem. I will try
>> to take a look later to better understand the problem.
>>
>>> I also booted an image completely successfully including dhcpd/sshd
>>> start, but then my executable crashed in clock_gettime. The executable
>>> was built on a linux/amd64 host with "riscv64-linux-gnu-gcc -static"
>>> (10.2.1).
>>>
>>>
>>>>> Second issue I am seeing seems to be related to text segment size.
>>>>> I check out v5.11 and use this config:
>>>>> https://gist.github.com/dvyukov/6af25474d455437577a84213b0cc9178
>>>>
>>>> This config gave my laptop a hard time! Finally I was able to boot
>>>> correctly to userspace, but I realized I used my sv48 branch... Either I
>>>> fixed your issue along the way or I can't reproduce it; I'll give it a
>>>> try tomorrow.
>>>
>>> Where is your branch? I could also test in my setup on your branch.
>>>
>>
>> You can find my branch int/alex/riscv_kernel_end_of_address_space_v2
>> here: https://github.com/AlexGhiti/riscv-linux.git
>
> No, it does not work for me.
>
> Source is on b61ab6c98de021398cd7734ea5fc3655e51e70f2 (HEAD,
> int/alex/riscv_kernel_end_of_address_space_v2)
> Config is https://gist.githubusercontent.com/dvyukov/6af25474d455437577a84213b0cc9178/raw/55b116522c14a8a98a7626d76df740d54f648ce5/gistfile1.txt
>
> riscv64-linux-gnu-gcc -v
> gcc version 10.2.1 20210110 (Debian 10.2.1-6+build1)
>
> qemu-system-riscv64 --version
> QEMU emulator version 5.2.0 (Debian 1:5.2+dfsg-3)
>
> qemu-system-riscv64 \
>   -machine virt -smp 2 -m 2G \
>   -device virtio-blk-device,drive=hd0 \
>   -drive file=image-riscv64,if=none,format=raw,id=hd0 \
>   -kernel arch/riscv/boot/Image \
>   -nographic \
>   -device virtio-rng-device,rng=rng0 -object rng-random,filename=/dev/urandom,id=rng0 \
>   -netdev user,id=net0,host=10.0.2.10,hostfwd=tcp::10022-:22 -device virtio-net-device,netdev=net0 \
>   -append "root=/dev/vda earlyprintk=serial console=ttyS0 oops=panic panic_on_warn=1 panic=86400 earlycon"

It still works for me, but I had to disable CONFIG_DEBUG_INFO_BTF (I don't think that changes anything at runtime). However, your command line above does not work for me, as it appears you do not load any firmware; if I add -bios images/fw_jump.elf, it works. But then I don't know where your OpenSBI output below comes from...

And regarding your issue with calling clock_gettime 'directly' compared to using the syscall, I get the same consistent output from both calls.

I have an older gcc (9.3.0) and the same qemu.
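The clock_gettime comparison mentioned above can be sketched roughly as follows. This is only an illustration, not the test program used in this thread: it assumes a 64-bit Linux target with glibc, where clock_gettime() is normally served from the vDSO while syscall(SYS_clock_gettime, ...) always enters the kernel. The helper names (read_via_libc, read_via_syscall, to_ns, clock_path_delta_ns) are made up for this example.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper: read the clock through libc. On glibc this
 * usually dispatches to the vDSO and does not enter the kernel. */
int read_via_libc(clockid_t id, struct timespec *ts)
{
    return clock_gettime(id, ts);
}

/* Hypothetical helper: force a real system call, bypassing the vDSO.
 * Assumes a 64-bit target, where struct timespec matches the layout
 * the kernel expects for SYS_clock_gettime. */
int read_via_syscall(clockid_t id, struct timespec *ts)
{
    return (int)syscall(SYS_clock_gettime, id, ts);
}

/* Convert a timespec to nanoseconds. */
int64_t to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Read the clock through both paths, libc first, and store the
 * difference (syscall reading minus libc reading) in *delta_ns.
 * Returns 0 on success, -1 if either read fails. */
int clock_path_delta_ns(clockid_t id, int64_t *delta_ns)
{
    struct timespec via_libc, via_syscall;

    if (read_via_libc(id, &via_libc) != 0)
        return -1;
    if (read_via_syscall(id, &via_syscall) != 0)
        return -1;

    *delta_ns = to_ns(&via_syscall) - to_ns(&via_libc);
    return 0;
}
```

With CLOCK_MONOTONIC, a healthy system should report a small non-negative delta from clock_path_delta_ns(). A crash inside read_via_libc() while read_via_syscall() succeeds would point at the vDSO path rather than at the kernel's syscall implementation.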
I think what is missing here is your buildroot config, so that we have the exact same environment: could you post your buildroot config as well?

Thanks,

>
> OpenSBI v0.8
>    ____                    _____ ____ _____
>   / __ \                  / ____|  _ \_   _|
>  | |  | |_ __   ___ _ __ | (___ | |_) || |
>  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
>  | |__| | |_) |  __/ | | |____) | |_) || |_
>   \____/| .__/ \___|_| |_|_____/|____/_____|
>         | |
>         |_|
>
> Platform Name          : riscv-virtio,qemu
> Platform Features      : timer,mfdeleg
> Platform HART Count    : 2
> Boot HART ID           : 1
> Boot HART ISA          : rv64imafdcsu
> BOOT HART Features     : pmp,scounteren,mcounteren,time
> BOOT HART PMP Count    : 16
> Firmware Base          : 0x80000000
> Firmware Size          : 104 KB
> Runtime SBI Version    : 0.2
>
> MIDELEG : 0x0000000000000222
> MEDELEG : 0x000000000000b109
> PMP0    : 0x0000000080000000-0x000000008001ffff (A)OpenSBI v0.6
>
>
> no output after this
> PMP1    : 0x0000000000000000-0xffffffffffffffff (A,R,W,X)
>
>
>
>> Thanks,
>>
>>>
>>>>> Then trying to boot it using:
>>>>> QEMU emulator version 5.2.0 (Debian 1:5.2+dfsg-3)
>>>>> $ qemu-system-riscv64 -machine virt -smp 2 -m 4G ...
>>>>>
>>>>> It shows no output from the kernel whatsoever, even though I have
>>>>> earlycon and output shows very early with other configs.
>>>>> Kernel boots fine with defconfig and other smaller configs.
>>>>>
>>>>> If I enable KASAN_OUTLINE and CC_OPTIMIZE_FOR_SIZE, then this config
>>>>> also boots fine. Both of these options significantly reduce kernel
>>>>> size. However, I can also boot the kernel without these 2 configs, if
>>>>> I disable a whole lot of subsystem configs. This makes me think that
>>>>> there is an issue related to kernel size somewhere in
>>>>> qemu/bootloader/kernel bootstrap code.
>>>>> Does it make sense to you? Can somebody reproduce what I am seeing?
>
>>>>
>>>> I did not bring any answer to your question, but at least you know I'm
>>>> working on it, I'll keep you posted.
>>>>
>>>> Thanks for taking the time to setup syzkaller.
>>>>
>>>> Alex
>>>>
>>>>> Thanks
>>>>>
>>>>> _______________________________________________
>>>>> linux-riscv mailing list
>>>>> linux-riscv@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>>>>>