Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935553AbcJXMMM (ORCPT ); Mon, 24 Oct 2016 08:12:12 -0400 Received: from merlin.infradead.org ([205.233.59.134]:55666 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934818AbcJXMMG (ORCPT ); Mon, 24 Oct 2016 08:12:06 -0400 Date: Mon, 24 Oct 2016 14:11:55 +0200 From: Peter Zijlstra To: Oleg Nesterov Cc: "Ni, BaoleX" , "mingo@redhat.com" , "acme@kernel.org" , "linux-kernel@vger.kernel.org" , "alexander.shishkin@linux.intel.com" , "Liu, Chuansheng" Subject: Re: hit a KASan bug related to Perf during stress test Message-ID: <20161024121155.GK3102@twins.programming.kicks-ass.net> References: <318B87A793BE164187D8851D6CE09D64371C8811@shsmsx102.ccr.corp.intel.com> <20161024095341.GF3102@twins.programming.kicks-ass.net> <20161024111526.GA13509@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20161024111526.GA13509@redhat.com> User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1107 Lines: 32 On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote: > Yes, current is still valid. > > But nothing protects current->group_leader or parent/real_parent, they > can point to the exited/freed task. We really need to nullify them in > __unhash_process() to catch the problems like this, I wanted to do this > many times... > > So you simply can't know your tgid or even tid after release_task() calls > __unhash_process(). Actually after exit_notify() unless the exiting task > autoreaps itself. > > How about the trivial fix below? > > Oleg. > > --- x/kernel/events/core.c > +++ x/kernel/events/core.c > @@ -1257,7 +1257,7 @@ static u32 perf_event_pid(struct perf_ev > if (event->parent) > event = event->parent; > > - return task_tgid_nr_ns(p, event->ns); > + return pid_alive(p) ? task_tgid_nr_ns(p, event->ns) : 0; > } > > static u32 perf_event_tid(struct perf_event *event, struct task_struct *p) Should we do the same for perf_event_tid() and report -1 as the pid/tid in the !alive case? -1 should be an obvious invalid pid since we limit the pid-space to less than 32 bits.