From: Dmitry Vyukov
Date: Tue, 25 May 2021 10:33:27 +0200
Subject: Re: [syzbot] KASAN: use-after-free Read in check_all_holdout_tasks_trace
To: "Paul E. McKenney"
Cc: "Xu, Yanfei", syzbot, rcu@vger.kernel.org, Andrew Morton,
    Andrii Nakryiko, Alexei Starovoitov, Jens Axboe, bpf,
    Christian Brauner, Daniel Borkmann, John Fastabend, Martin KaFai Lau,
    KP Singh, LKML, netdev, Shakeel Butt, Song Liu, syzkaller-bugs,
    Yonghong Song
In-Reply-To: <20210525033355.GN4441@paulmck-ThinkPad-P17-Gen-1>
References: <000000000000f034fc05c2da6617@google.com>
    <20210524041350.GJ4441@paulmck-ThinkPad-P17-Gen-1>
    <20210524224602.GA1963972@paulmck-ThinkPad-P17-Gen-1>
    <24f352fc-c01e-daa8-5138-1f89f75c7c16@windriver.com>
    <20210525033355.GN4441@paulmck-ThinkPad-P17-Gen-1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 25, 2021 at 5:33 AM Paul E. McKenney wrote:
>
> On Tue, May 25, 2021 at 10:31:55AM +0800, Xu, Yanfei wrote:
> >
> >
> > On 5/25/21 6:46 AM, Paul E.
> > McKenney wrote:
> > > On Sun, May 23, 2021 at 09:13:50PM -0700, Paul E. McKenney wrote:
> > > > On Sun, May 23, 2021 at 08:51:56AM +0200, Dmitry Vyukov wrote:
> > > > > On Fri, May 21, 2021 at 7:29 PM syzbot wrote:
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > > syzbot found the following issue on:
> > > > > >
> > > > > > HEAD commit:    f18ba26d libbpf: Add selftests for TC-BPF management API
> > > > > > git tree:       bpf-next
> > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17f50d1ed00000
> > > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=8ff54addde0afb5d
> > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=7b2b13f4943374609532
> > > > > >
> > > > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > > > >
> > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > > Reported-by: syzbot+7b2b13f4943374609532@syzkaller.appspotmail.com
> > > > >
> > > > > This looks rcu-related. +rcu mailing list
> > > >
> > > > I think I see a possible cause for this, and will say more after some
> > > > testing and after becoming more awake Monday morning, Pacific time.
> > >
> > > No joy.  From what I can see, within RCU Tasks Trace, the calls to
> > > get_task_struct() are properly protected (either by RCU or by an earlier
> > > get_task_struct()), and the calls to put_task_struct() are balanced by
> > > those to get_task_struct().
> > >
> > > I could of course have missed something, but at this point I am suspecting
> > > an unbalanced put_task_struct() has been added elsewhere.
> > >
> > > As always, extra eyes on this code would be a good thing.
> > >
> > > If it were reproducible, I would of course suggest bisection.  :-/
> > >
> > >                                                         Thanx, Paul
> > >
> > Hi Paul,
> >
> > Could it be?
> >
> > CPU1                                  CPU2
> > trc_add_holdout(t, bhp)
> > //t->usage==2
> >                                       release_task
> >                                         put_task_struct_rcu_user
> >                                           delayed_put_task_struct
> >                                             ......
> >                                             put_task_struct(t)
> >                                             //t->usage==1
> >
> > check_all_holdout_tasks_trace
> >   ->trc_wait_for_one_reader
> >     ->trc_del_holdout
> >       ->put_task_struct(t)
> >       //t->usage==0 and task_struct freed
> >   READ_ONCE(t->trc_reader_checked)
> >   //oops, t has already been freed.
> >
> > So, after executing trc_wait_for_one_reader(), the task might have been
> > removed from the holdout list and the corresponding task_struct freed.
> > And we shouldn't do READ_ONCE(t->trc_reader_checked).
>
> I was suspicious of that call to trc_del_holdout() from within
> trc_wait_for_one_reader(), but the only time it executes is in the
> context of the current running task, which means that CPU 2 had better
> not be invoking release_task() on it just yet.
>
> Or am I missing your point?
>
> Of course, if you can reproduce it, the following patch might be
> an interesting thing to try, my doubts notwithstanding.  But more
> important, please check the patch to make sure that we are both
> talking about the same call to trc_del_holdout()!
>
> If we are talking about the same call to trc_del_holdout(), are you
> actually seeing that code execute except when rcu_tasks_trace_pertask()
> calls trc_wait_for_one_reader()?
>
> > I investigated trc_wait_for_one_reader() and found that before we execute
> > trc_del_holdout(), t->trc_reader_checked is always set to true. How about
> > we just set the checked flag and uniformly execute trc_del_holdout()
> > in check_all_holdout_tasks_trace() when the flag is set?
>
> The problem is that we cannot execute trc_del_holdout() except in
> the context of the RCU Tasks Trace grace-period kthread.
> So it is necessary to manipulate ->trc_reader_checked separately from the
> list in order to safely synchronize with IPIs and with the exit code path
> for any reader tasks, see for example trc_read_check_handler() and
> exit_tasks_rcu_finish_trace().
>
> Or are you thinking of some other approach?

This could be caused by a buggy extra put_pid somewhere else, right?
If so, I suspect that's what may be happening. We have 2 very similar
use-after-free reports on an internal kernel, but it also has a number
of other use-after-free reports in pid-related functions
(pid_task/pid_nr_ns/attach_pid). One of them is happening relatively
frequently (150 crashes) and is caused by something in the tty
subsystem. Presumably it may be causing one-off use-after-frees in
other random places of the kernel as well.
Unfortunately these crashes don't happen on the upstream kernel (at
least not yet). So if you don't see any obvious smoking gun in rcu, I
think we can assume for now that it's due to tty.

>                                                         Thanx, Paul
>
> ------------------------------------------------------------------------
>
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index efb8127f3a36..2a0d4bdd619a 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -987,7 +987,6 @@ static void trc_wait_for_one_reader(struct task_struct *t,
>  	// The current task had better be in a quiescent state.
>  	if (t == current) {
>  		t->trc_reader_checked = true;
> -		trc_del_holdout(t);
>  		WARN_ON_ONCE(READ_ONCE(t->trc_reader_nesting));
>  		return;
>  	}