Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp3986463pxj; Mon, 24 May 2021 20:38:53 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxDx3iGFqeBO7tkLWyadyDv7oq5s2xR2XuwKW6p4OmK2iSlojVlKaJPRZXzO02M5s1Coc/+ X-Received: by 2002:a17:906:bcf9:: with SMTP id op25mr27266901ejb.453.1621913933729; Mon, 24 May 2021 20:38:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1621913933; cv=none; d=google.com; s=arc-20160816; b=Y/nqeKamemguGhIvv8y3DbhsugLimC9s8QWV2G4L/4FTF1D/VsnN8eoFC8ZIHkG5mX Q9/9JYRhFojeW7pTv7WTDqLkqz4zSVjHx9fwiko820O/9uan0ti6twI8enVYAB/qKawv bQ9h39FdsmOWJ5c1oXjNfRbKQuAOHx0ygOZ/Qa82e1OsYyQIi0ifE/pCbMvd0TI2zLMC jXmHC/Ck4FcXiCnimk/OmUYTxaKgvu9amMuctEnYRu3veNputj3k3noG0jJgtMWQfesY S1K9XOfzhGvyxyDEsZVGnULqIlt5HT1zZ9qhkKmGMR8KgDqLCxD+nvY7IDcIvJ2zkN9Z d6iw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:reply-to:message-id :subject:cc:to:from:date:dkim-signature; bh=wNc+z8LhLe1RBKnqRWJiJO6aUZPBf7ZND2jxakTXh+c=; b=tRVFCSpvzuJ08ylzm1YrpkCeYGtOAy5/J+L/XAjvxf1FwJaJ9oaJkGlAFCHvOrS6XW MPVoTpPjvUhl4CBiJs0aZHf7/pFE25yLckbFX1/KKZbJ2goG6TkUjczDfD4CI8SRdpZq ox//5E+NsOVlmUZsPqxYit6wxtnaIW1eLuXyvop3Wwekk8aw0vjgW3RQwKaHcDAkEe4q 7svqZf6wAqRGuUZ0NDYwYuVPdTAVooQ/KBK1lLwcSwqYlyXD8N4XQ6LuD7tCMWyTqLmA XOSRtvNve3avd0gKjdevmzFwZ7KEphzMHrZe5CJnVyM2dlcnbUFjPhhlDY3PpOFMK4mI 1oaA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=sBzbYmQG; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id d24si13125921edv.171.2021.05.24.20.38.29; Mon, 24 May 2021 20:38:53 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=sBzbYmQG; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230248AbhEYDf0 (ORCPT + 99 others); Mon, 24 May 2021 23:35:26 -0400 Received: from mail.kernel.org ([198.145.29.99]:55776 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229598AbhEYDfZ (ORCPT ); Mon, 24 May 2021 23:35:25 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 129A9613D8; Tue, 25 May 2021 03:33:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1621913636; bh=jbjn/vKUEDevK4XHR8qG+RAbg5jKnkEBVFZyUjV4JgA=; h=Date:From:To:Cc:Subject:Reply-To:References:In-Reply-To:From; b=sBzbYmQGgUhb/ULTiPf716X/9EUYtWjtFuK/K2nPHIw9CrhvLFMRWHHoalJTpin7U asRgxEOx9OZr6Q98N1IW6pjX9JXgdhLuU6paa8ZT5XOOjXFbdnbMpgDlIkJobehxIo 5BkfGRh4RaCX51evCYHCn3Pd6W11WaN+4MDkQuZZijBaOMBGghqRHTwxXoAbOPxHGU Id32mxHzwOD0/QWTCVbwHDqMl0z3CqsEpHhVLEBIc9pnhGR4o72hTVkIc8aHtJyIgM j+l5PUthQQa/GVgP2/0KBvaNZcHDysi+btc4j4Au5OuG6+oUbw631rAY2w6y4yW5k0 fNZ8Hi2qyduwg== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id D34D45C039E; Mon, 24 May 2021 20:33:55 -0700 (PDT) Date: Mon, 24 May 2021 20:33:55 -0700 From: "Paul E. McKenney" To: "Xu, Yanfei" Cc: Dmitry Vyukov , syzbot , rcu@vger.kernel.org, Andrew Morton , Andrii Nakryiko , Alexei Starovoitov , Jens Axboe , bpf , Christian Brauner , Daniel Borkmann , John Fastabend , Martin KaFai Lau , KP Singh , LKML , netdev , Shakeel Butt , Song Liu , syzkaller-bugs , Yonghong Song Subject: Re: [syzbot] KASAN: use-after-free Read in check_all_holdout_tasks_trace Message-ID: <20210525033355.GN4441@paulmck-ThinkPad-P17-Gen-1> Reply-To: paulmck@kernel.org References: <000000000000f034fc05c2da6617@google.com> <20210524041350.GJ4441@paulmck-ThinkPad-P17-Gen-1> <20210524224602.GA1963972@paulmck-ThinkPad-P17-Gen-1> <24f352fc-c01e-daa8-5138-1f89f75c7c16@windriver.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <24f352fc-c01e-daa8-5138-1f89f75c7c16@windriver.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, May 25, 2021 at 10:31:55AM +0800, Xu, Yanfei wrote: > > > On 5/25/21 6:46 AM, Paul E. McKenney wrote: > > [Please note: This e-mail is from an EXTERNAL e-mail address] > > > > On Sun, May 23, 2021 at 09:13:50PM -0700, Paul E. McKenney wrote: > > > On Sun, May 23, 2021 at 08:51:56AM +0200, Dmitry Vyukov wrote: > > > > On Fri, May 21, 2021 at 7:29 PM syzbot > > > > wrote: > > > > > > > > > > Hello, > > > > > > > > > > syzbot found the following issue on: > > > > > > > > > > HEAD commit: f18ba26d libbpf: Add selftests for TC-BPF management API > > > > > git tree: bpf-next > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=17f50d1ed00000 > > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=8ff54addde0afb5d > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=7b2b13f4943374609532 > > > > > > > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > > > > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > > > > Reported-by: syzbot+7b2b13f4943374609532@syzkaller.appspotmail.com > > > > > > > > This looks rcu-related. +rcu mailing list > > > > > > I think I see a possible cause for this, and will say more after some > > > testing and after becoming more awake Monday morning, Pacific time. > > > > No joy. From what I can see, within RCU Tasks Trace, the calls to > > get_task_struct() are properly protected (either by RCU or by an earlier > > get_task_struct()), and the calls to put_task_struct() are balanced by > > those to get_task_struct(). > > > > I could of course have missed something, but at this point I am suspecting > > an unbalanced put_task_struct() has been added elsewhere. > > > > As always, extra eyes on this code would be a good thing. > > > > If it were reproducible, I would of course suggest bisection. :-/ > > > > Thanx, Paul > > > Hi Paul, > > Could it be? > > CPU1 CPU2 > trc_add_holdout(t, bhp) > //t->usage==2 > release_task > put_task_struct_rcu_user > delayed_put_task_struct > ...... > put_task_struct(t) > //t->usage==1 > > check_all_holdout_tasks_trace > ->trc_wait_for_one_reader > ->trc_del_holdout > ->put_task_struct(t) > //t->usage==0 and task_struct freed > READ_ONCE(t->trc_reader_checked) > //ops, t had been freed. > > So, after excuting trc_wait_for_one_reader(), task might had been removed > from holdout list and the corresponding task_struct was freed. > And we shouldn't do READ_ONCE(t->trc_reader_checked). I was suspicious of that call to trc_del_holdout() from within trc_wait_for_one_reader(), but the only time it executes is in the context of the current running task, which means that CPU 2 had better not be invoking release_task() on it just yet. Or am I missing your point? Of course, if you can reproduce it, the following patch might be an interesting thing to try, my doubts notwithstanding. But more important, please check the patch to make sure that we are both talking about the same call to trc_del_holdout()! If we are talking about the same call to trc_del_holdout(), are you actually seeing that code execute except when rcu_tasks_trace_pertask() calls trc_wait_for_one_reader()? > I investigate the trc_wait_for_one_reader() and found before we excute > trc_del_holdout, there is always set t->trc_reader_checked=true. How about > we just set the checked flag and unified excute trc_del_holdout() > in check_all_holdout_tasks_trace with checking the flag? The problem is that we cannot execute trc_del_holdout() except in the context of the RCU Tasks Trace grace-period kthread. So it is necessary to manipulate ->trc_reader_checked separately from the list in order to safely synchronize with IPIs and with the exit code path for any reader tasks, see for example trc_read_check_handler() and exit_tasks_rcu_finish_trace(). Or are you thinking of some other approach? Thanx, Paul ------------------------------------------------------------------------ diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h index efb8127f3a36..2a0d4bdd619a 100644 --- a/kernel/rcu/tasks.h +++ b/kernel/rcu/tasks.h @@ -987,7 +987,6 @@ static void trc_wait_for_one_reader(struct task_struct *t, // The current task had better be in a quiescent state. if (t == current) { t->trc_reader_checked = true; - trc_del_holdout(t); WARN_ON_ONCE(READ_ONCE(t->trc_reader_nesting)); return; }