Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S936362AbcJXOgx (ORCPT ); Mon, 24 Oct 2016 10:36:53 -0400 Received: from merlin.infradead.org ([205.233.59.134]:56170 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S935051AbcJXOgt (ORCPT ); Mon, 24 Oct 2016 10:36:49 -0400 Date: Mon, 24 Oct 2016 16:36:46 +0200 From: Peter Zijlstra To: Oleg Nesterov Cc: "Ni, BaoleX" , "mingo@redhat.com" , "acme@kernel.org" , "linux-kernel@vger.kernel.org" , "alexander.shishkin@linux.intel.com" , "Liu, Chuansheng" Subject: Re: hit a KASan bug related to Perf during stress test Message-ID: <20161024143646.GR3102@twins.programming.kicks-ass.net> References: <318B87A793BE164187D8851D6CE09D64371C8811@shsmsx102.ccr.corp.intel.com> <20161024095341.GF3102@twins.programming.kicks-ass.net> <20161024111526.GA13509@redhat.com> <20161024112402.GI3102@twins.programming.kicks-ass.net> <20161024120231.GA16554@redhat.com> <20161024121030.GA17007@redhat.com> <20161024122210.GM3102@twins.programming.kicks-ass.net> <20161024122942.GC17007@redhat.com> <20161024123814.GP3102@twins.programming.kicks-ass.net> <20161024132555.GA18410@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20161024132555.GA18410@redhat.com> User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1799 Lines: 56 On Mon, Oct 24, 2016 at 03:25:55PM +0200, Oleg Nesterov wrote: > Well, if we add that PIDTYPE_TGID hack, I think we can do something > like below... > > Or do you think we should add a perf_alive() check into perf_event_pid() > for a quick fix? That is what I was thinking. Then we don't need to do the TGID hack, I suspect some people might object to that. > Either way it's a pity we can't report at least the valid tid, perhaps > perf_event_tid() could use task_pid_nr() if event->ns == init_pid_ns, > I dunno. Right, but after unhash is there really still the notion of a valid TID? I mean, the TID can be reused, at which point you'll end up with two tasks etc.. But yes, very tedious. I was thinking something like so? --- kernel/events/core.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index c6e47e97b33f..2c9a22485e9e 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -1257,7 +1257,14 @@ static u32 perf_event_pid(struct perf_event *event, struct task_struct *p) if (event->parent) event = event->parent; - return task_tgid_nr_ns(p, event->ns); + /* + * It is possible the task already got unhashed, in which case we + * cannot determine the current->group_leader/real_parent. + * + * Also, report -1 to indicate unhashed, so as not to confused with + * 0 for the idle task. + */ + return pid_alive(p) ? task_tgid_nr_ns(p, event->ns) : ~0; } static u32 perf_event_tid(struct perf_event *event, struct task_struct *p) @@ -1268,7 +1275,7 @@ static u32 perf_event_tid(struct perf_event *event, struct task_struct *p) if (event->parent) event = event->parent; - return task_pid_nr_ns(p, event->ns); + return pid_alive(p) ? task_pid_nr_ns(p, event->ns) : ~0; } /*