From: Dmitry Vyukov
Date: Fri, 12 Mar 2021 18:34:48 +0100
Subject: Re: [syzbot] BUG: unable to handle kernel access to user memory in schedule_tail
To: Ben Dooks
Cc: syzbot, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv,
 Daniel Bristot de Oliveira, Benjamin Segall, dietmar.eggemann@arm.com,
 Juri Lelli, LKML, Mel Gorman, Ingo Molnar, Peter Zijlstra,
 Steven Rostedt, syzkaller-bugs, Vincent Guittot
References: <000000000000b74f1b05bd316729@google.com>
 <84b0471d-42c1-175f-ae1d-a18c310c7f77@codethink.co.uk>
 <816870e9-9354-ffbd-936b-40e38e4276a4@codethink.co.uk>
 <4ce57c7e-6e5d-d136-0a81-395a4207ba44@codethink.co.uk>
In-Reply-To: <4ce57c7e-6e5d-d136-0a81-395a4207ba44@codethink.co.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Mar 12, 2021 at 5:36 PM Ben Dooks wrote:
>
> On 12/03/2021 16:34, Ben Dooks wrote:
> > On 12/03/2021 16:30, Ben Dooks wrote:
> >> On 12/03/2021 15:12, Dmitry Vyukov wrote:
> >>> On Fri, Mar 12, 2021 at 2:50 PM Ben Dooks wrote:
> >>>>
> >>>> On 10/03/2021 17:16, Dmitry Vyukov wrote:
> >>>>> On Wed, Mar 10, 2021 at 5:46 PM syzbot wrote:
> >>>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> syzbot found the following issue on:
> >>>>>>
> >>>>>> HEAD commit: 0d7588ab
> >>>>>> riscv: process: Fix no prototype for arch_dup_tas..
> >>>>>> git tree:
> >>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
> >>>>>> console output:
> >>>>>> https://syzkaller.appspot.com/x/log.txt?x=1212c6e6d00000
> >>>>>> kernel config:
> >>>>>> https://syzkaller.appspot.com/x/.config?x=e3c595255fb2d136
> >>>>>> dashboard link:
> >>>>>> https://syzkaller.appspot.com/bug?extid=e74b94fe601ab9552d69
> >>>>>> userspace arch: riscv64
> >>>>>>
> >>>>>> Unfortunately, I don't have any reproducer for this issue yet.
> >>>>>>
> >>>>>> IMPORTANT: if you fix the issue, please add the following tag to
> >>>>>> the commit:
> >>>>>> Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com
> >>>>>
> >>>>> +riscv maintainers
> >>>>>
> >>>>> This is riscv64-specific.
> >>>>> I've seen similar crashes in put_user in other places. It looks like
> >>>>> put_user crashes if the user address is not mapped/protected (?).
> >>>>
> >>>> I've been having a look, and this seems to be down to access of the
> >>>> tsk->set_child_tid variable. I assume the fuzzing here is to pass a
> >>>> bad address to clone?
> >>>>
> >>>> From looking at the code, the put_user() code should have set the
> >>>> relevant SR_SUM bit (the value for this, which is 1<<18, is in the
> >>>> s2 register in the crash report) and from looking at the compiler
> >>>> output from my gcc-10, the code looks to be doing the relevant csrs
> >>>> and then csrc around the put_user.
> >>>>
> >>>> So currently I do not understand how the above could have happened
> >>>> other than something re-tried the code sequence and ended up retrying
> >>>> the faulting instruction without the SR_SUM bit set.
> >>>
> >>> I would maybe blame qemu for randomly resetting SR_SUM, but it's
> >>> strange that 99% of these crashes are in schedule_tail. If it were
> >>> qemu, then they would be more evenly distributed...
> >>>
> >>> Another observation: looking at a dozen crash logs, in none of
> >>> these cases was the fuzzer actually trying to fuzz clone with some
> >>> insane arguments. So it looks like completely normal clones (e.g.
> >>> coming from pthread_create) result in this crash.
> >>>
> >>> I also wonder why there is ret_from_exception, is it normal? I see
> >>> handle_exception disables SR_SUM:
> >>> https://elixir.bootlin.com/linux/v5.12-rc2/source/arch/riscv/kernel/entry.S#L73
> >>>
> >>
> >> So I think if SR_SUM is set, then it faults the access to user memory
> >> which the _user() routines clear to allow them access.
> >>
> >> I'm thinking there is at least one issue here:
> >>
> >> - the test in fault is the wrong way around for die kernel
> >> - the handler only catches this if the page has yet to be mapped.
> >>
> >> So I think the test should be:
> >>
> >> if (!user_mode(regs) && addr < TASK_SIZE &&
> >>     unlikely(regs->status & SR_SUM))
> >>
> >> This then should continue on and allow the rest of the handler to
> >> complete mapping the page if it is not there.
> >>
> >> I have been trying to create a very simple clone test, but so far it
> >> has yet to actually trigger anything.
> >
> > I should have added there doesn't seem to be a good way to use mmap()
> > to allocate memory but not insert a vm-mapping post the mmap().
> >
> How difficult is it to try building a branch with the above test
> modified?

I don't have access to hardware, and I don't have other qemu versions
ready to use. But I can teach you how to run syzkaller locally :)
I am not sure anybody has run it on real riscv hardware at all. When
Tobias ported syzkaller, Tobias also used qemu I think.

I am now building with an inverted check to test locally.

I don't fully understand this code, but does handle_exception reset
SR_SUM around do_page_fault? If so, then looking at SR_SUM in
do_page_fault won't work with either a positive or a negative check.
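
To make the question concrete, here is a small user-space C model of the two
things being compared: the live status CSR, which the entry code clears, and
the copy saved in regs->status, which is what the fault check reads. It is
only a sketch and not kernel code. The model_* names are invented for
illustration, and the model assumes that trap entry saves the value the CSR
held before SR_SUM was cleared. That assumption is the part that would need
checking against entry.S. The only things taken as given are the csrrc
semantics (return the old value, then clear the mask bits) and the 1<<18
value of SR_SUM quoted earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SR_SUM is bit 18 of the status CSR (the 1<<18 value quoted in the thread). */
#define SR_SUM (UINT64_C(1) << 18)

/* Hypothetical stand-in for the live status CSR; not a kernel symbol. */
uint64_t model_csr_status;

/* csrrc as the ISA defines it: return the old value, then clear the mask bits. */
uint64_t model_csrrc(uint64_t *csr, uint64_t mask)
{
	uint64_t old = *csr;

	*csr &= ~mask;
	return old;
}

/* Hypothetical, reduced pt_regs holding only the saved status word. */
struct model_regs {
	uint64_t status;
};

/* ASSUMPTION: trap entry saves the value read before SR_SUM was cleared. */
struct model_regs model_trap_entry(void)
{
	struct model_regs regs = {
		.status = model_csrrc(&model_csr_status, SR_SUM),
	};

	return regs;
}

/* Polarity proposed in the thread: die if SR_SUM was set at trap time. */
bool model_die_if_sum_set(const struct model_regs *regs, bool user_mode,
			  bool addr_below_task_size)
{
	return !user_mode && addr_below_task_size && (regs->status & SR_SUM);
}

/* Opposite polarity: die if SR_SUM was clear at trap time. */
bool model_die_if_sum_clear(const struct model_regs *regs, bool user_mode,
			    bool addr_below_task_size)
{
	return !user_mode && addr_below_task_size && !(regs->status & SR_SUM);
}

/* Walk through one kernel-mode fault on a user address with SUM set. */
void model_demo(void)
{
	struct model_regs regs;

	model_csr_status = SR_SUM;	/* pretend a uaccess routine set SUM */
	regs = model_trap_entry();

	assert((model_csr_status & SR_SUM) == 0);	/* live CSR: cleared */
	assert((regs.status & SR_SUM) != 0);		/* saved copy: still set */
	assert(model_die_if_sum_set(&regs, false, true));
	assert(!model_die_if_sum_clear(&regs, false, true));
}
```

Under that assumption the saved copy still distinguishes the two cases even
though the live CSR reads as cleared inside the handler, so the two
polarities give opposite answers for the same fault. If the entry code
instead saved the already-cleared value, both checks would see SR_SUM as
clear every time, which is the situation the question above is worried about.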