From: Dmitry Vyukov
Date: Fri, 12 Mar 2021 18:38:57 +0100
Subject: Re: [syzbot] BUG: unable to handle kernel access to user memory in schedule_tail
To: Ben Dooks
Cc: syzbot, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv,
 Daniel Bristot de Oliveira, Benjamin Segall, dietmar.eggemann@arm.com,
 Juri Lelli, LKML, Mel Gorman, Ingo Molnar, Peter Zijlstra,
 Steven Rostedt, syzkaller-bugs, Vincent Guittot
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 12, 2021 at 6:34 PM Dmitry Vyukov wrote:
>
> On Fri, Mar 12, 2021 at 5:36 PM Ben Dooks wrote:
> >
> > On 12/03/2021 16:34, Ben Dooks wrote:
> > > On 12/03/2021 16:30, Ben Dooks wrote:
> > >> On 12/03/2021 15:12, Dmitry Vyukov wrote:
> > >>> On Fri, Mar 12, 2021 at 2:50 PM Ben Dooks wrote:
> > >>>>
> > >>>> On 10/03/2021 17:16, Dmitry Vyukov wrote:
> > >>>>> On Wed, Mar 10, 2021 at 5:46 PM syzbot wrote:
> > >>>>>>
> > >>>>>> Hello,
> > >>>>>>
> > >>>>>> syzbot found the following issue on:
> > >>>>>>
> > >>>>>> HEAD commit:    0d7588ab riscv: process: Fix no prototype for arch_dup_tas..
> > >>>>>> git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> > >>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1212c6e6d00000
> > >>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=e3c595255fb2d136
> > >>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=e74b94fe601ab9552d69
> > >>>>>> userspace arch: riscv64
> > >>>>>>
> > >>>>>> Unfortunately, I don't have any reproducer for this issue yet.
> > >>>>>>
> > >>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > >>>>>> Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com
> > >>>>>
> > >>>>> +riscv maintainers
> > >>>>>
> > >>>>> This is riscv64-specific.
> > >>>>> I've seen similar crashes in put_user in other places. It looks like
> > >>>>> put_user crashes if the user address is not mapped/protected (?).
> > >>>>
> > >>>> I've been having a look, and this seems to be down to access of the
> > >>>> tsk->set_child_tid variable. I assume the fuzzing here is to pass a
> > >>>> bad address to clone?
> > >>>>
> > >>>> From looking at the code, the put_user() code should have set the
> > >>>> relevant SR_SUM bit (the value for this, which is 1<<18, is in the
> > >>>> s2 register in the crash report), and from looking at the compiler
> > >>>> output from my gcc-10, the code looks to be doing the relevant csrs
> > >>>> and then csrc around the put_user.
> > >>>>
> > >>>> So currently I do not understand how the above could have happened,
> > >>>> other than something re-tried the code sequence and ended up retrying
> > >>>> the faulting instruction without the SR_SUM bit set.
> > >>>
> > >>> I would maybe blame qemu for randomly resetting SR_SUM, but it's
> > >>> strange that 99% of these crashes are in schedule_tail. If it were
> > >>> qemu, they would be more evenly distributed...
> > >>>
> > >>> Another observation: looking at a dozen crash logs, in none of
> > >>> these cases was the fuzzer actually trying to fuzz clone with some
> > >>> insane arguments. So it looks like completely normal clones (e.g.
> > >>> coming from pthread_create) result in this crash.
> > >>>
> > >>> I also wonder why there is ret_from_exception, is that normal? I see
> > >>> handle_exception disables SR_SUM:
> > >>> https://elixir.bootlin.com/linux/v5.12-rc2/source/arch/riscv/kernel/entry.S#L73
> > >>>
> > >>
> > >> So I think the SR_SUM state determines whether an access to user
> > >> memory faults, and the _user() routines toggle it around their
> > >> accesses to allow them through.
> > >>
> > >> I'm thinking there is at least one issue here:
> > >>
> > >> - the test in the fault handler is the wrong way around for the
> > >>   die_kernel_fault() case
> > >> - the handler only catches this if the page has yet to be mapped.
> > >>
> > >> So I think the test should be:
> > >>
> > >>     if (!user_mode(regs) && addr < TASK_SIZE &&
> > >>         unlikely(regs->status & SR_SUM))
> > >>
> > >> This then should continue on and allow the rest of the handler to
> > >> complete mapping the page if it is not there.
> > >>
> > >> I have been trying to create a very simple clone test, but so far it
> > >> has yet to actually trigger anything.
> > >
> > > I should have added there doesn't seem to be a good way to use mmap()
> > > to allocate memory but not insert a vm-mapping after the mmap().
> >
> > How difficult is it to try building a branch with the above test
> > modified?
>
> I don't have access to hardware, and I don't have other qemu versions
> ready to use. But I can teach you how to run syzkaller locally :)
> I am not sure anybody runs it on real riscv hardware at all. When
> Tobias ported syzkaller, Tobias also used qemu I think.
>
> I am now building with an inverted check to test locally.
>
> I don't fully understand this code, but does handle_exception
> reset SR_SUM around do_page_fault? If so, then looking at SR_SUM in
> do_page_fault won't work with either a positive or a negative check.
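For reference, the csrs/csrc pattern Ben describes around put_user()
looks roughly like this (a sketch from my reading of the v5.12-era
arch/riscv uaccess code; treat the macro names as approximate):

/*
 * Sketch of the RISC-V user-access window: put_user() expands to
 * roughly enable -> store -> disable.
 */
#define SR_SUM	(1UL << 18)	/* sstatus.SUM: permit S-mode access to user pages */

/* csrs: set sstatus.SUM so the kernel may touch user mappings */
#define __enable_user_access() \
	__asm__ __volatile__ ("csrs sstatus, %0" : : "r" (SR_SUM) : "memory")

/* csrc: clear sstatus.SUM once the user access is done */
#define __disable_user_access() \
	__asm__ __volatile__ ("csrc sstatus, %0" : : "r" (SR_SUM) : "memory")

If something clears the bit between the csrs and the faulting store
(qemu, or an exception path that does not restore it), the saved
regs->status at fault time would show SR_SUM clear, which is what the
original check keys on.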
The inverted check crashes during boot:

--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -249,7 +249,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
 		flags |= FAULT_FLAG_USER;
 
 	if (!user_mode(regs) && addr < TASK_SIZE &&
-	    unlikely(!(regs->status & SR_SUM)))
+	    unlikely(regs->status & SR_SUM))
 		die_kernel_fault("access to user memory without uaccess routines",
 				 addr, regs);

[   77.349329][    T1] Run /sbin/init as init process
[   77.868371][    T1] Unable to handle kernel access to user memory without uaccess routines at virtual address 00000000000e8e39
[   77.870355][    T1] Oops [#1]
[   77.870766][    T1] Modules linked in:
[   77.871326][    T1] CPU: 0 PID: 1 Comm: init Not tainted 5.12.0-rc2-00010-g0d7588ab9ef9-dirty #42
[   77.872057][    T1] Hardware name: riscv-virtio,qemu (DT)
[   77.872620][    T1] epc : __clear_user+0x36/0x4e
[   77.873285][    T1]  ra : padzero+0x9c/0xb0
[   77.873849][    T1] epc : ffffffe000bb7136 ra : ffffffe0004f42a0 sp : ffffffe006f8fbc0
[   77.874438][    T1]  gp : ffffffe005d25718 tp : ffffffe006f98000 t0 : 00000000000e8e40
[   77.875031][    T1]  t1 : 00000000000e9000 t2 : 000000000001c49c s0 : ffffffe006f8fbf0
[   77.875618][    T1]  s1 : 00000000000001c7 a0 : 00000000000e8e39 a1 : 00000000000001c7
[   77.876204][    T1]  a2 : 0000000000000002 a3 : 00000000000e9000 a4 : ffffffe006f99000
[   77.876787][    T1]  a5 : 0000000000000000 a6 : 0000000000f00000 a7 : ffffffe00031c088
[   77.877367][    T1]  s2 : 00000000000e8e39 s3 : 0000000000001000 s4 : 0000003ffffffe39
[   77.877952][    T1]  s5 : 00000000000e8e39 s6 : 00000000000e9570 s7 : 00000000000e8e39
[   77.878535][    T1]  s8 : 0000000000000001 s9 : 00000000000e8e39 s10: ffffffe00c65f608
[   77.879126][    T1]  s11: ffffffe00816e8d8 t3 : ea3af0fa372b8300 t4 : 0000000000000003
[   77.879711][    T1]  t5 : ffffffc401dc45d8 t6 : 0000000000040000
[   77.880209][    T1] status: 0000000000040120 badaddr: 00000000000e8e39 cause: 000000000000000f
[   77.880846][    T1] Call Trace:
[   77.881213][    T1] [] __clear_user+0x36/0x4e
[   77.881912][    T1] [] load_elf_binary+0xf8a/0x2400
[   77.882562][    T1] [] bprm_execve+0x5b0/0x1080
[   77.883145][    T1] [] kernel_execve+0x204/0x288
[   77.883727][    T1] [] run_init_process+0x1fe/0x212
[   77.884337][    T1] [] try_to_run_init_process+0x1e/0x66
[   77.884956][    T1] [] kernel_init+0x14a/0x200
[   77.885541][    T1] [] ret_from_exception+0x0/0x14
[   77.886955][    T1] ---[ end trace 1e934d07b8a4bed8 ]---
[   77.887705][    T1] Kernel panic - not syncing: Fatal exception
[   77.888333][    T1] SMP: stopping secondary CPUs
[   77.889357][    T1] Rebooting in 86400 seconds..
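For completeness, the sort of minimal clone test Ben mentions could look
something like this (an untested sketch; I am assuming the riscv64/generic
clone() argument order of flags, stack, parent_tid, tls, child_tid):

/*
 * Untested sketch: try to make the child's first put_user() in
 * schedule_tail() take a page fault.  CLONE_CHILD_SETTID asks the
 * kernel to write the child's TID to tidptr from schedule_tail();
 * a fresh anonymous mapping has a VMA but no PTE until first touch,
 * so that put_user() has to fault inside the uaccess window.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	pid_t *tidptr = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (tidptr == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Raw clone(2): the kernel, not userspace, writes *tidptr. */
	long pid = syscall(SYS_clone, CLONE_CHILD_SETTID | SIGCHLD,
			   0UL, NULL, 0UL, tidptr);
	if (pid < 0) {
		perror("clone");
		return 1;
	}
	if (pid == 0)
		_exit(0);	/* child: exit immediately */

	waitpid((pid_t)pid, NULL, 0);
	printf("kernel wrote child tid %d\n", *tidptr);
	return 0;
}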