Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1757490AbcJXMER (ORCPT ); Mon, 24 Oct 2016 08:04:17 -0400
Received: from mx1.redhat.com ([209.132.183.28]:54896 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1757447AbcJXMEP (ORCPT ); Mon, 24 Oct 2016 08:04:15 -0400
Date: Mon, 24 Oct 2016 14:04:11 +0200
From: Jiri Olsa
To: Peter Zijlstra
Cc: Oleg Nesterov, "Ni, BaoleX", "mingo@redhat.com", "acme@kernel.org",
        "linux-kernel@vger.kernel.org", "alexander.shishkin@linux.intel.com",
        "Liu, Chuansheng", Jiri Olsa
Subject: Re: hit a KASan bug related to Perf during stress test
Message-ID: <20161024120411.GA27567@krava>
References: <318B87A793BE164187D8851D6CE09D64371C8811@shsmsx102.ccr.corp.intel.com>
        <20161024095341.GF3102@twins.programming.kicks-ass.net>
        <20161024111526.GA13509@redhat.com>
        <20161024112732.GJ3102@twins.programming.kicks-ass.net>
        <20161024112945.GI3157@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161024112945.GI3157@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.7.1 (2016-10-04)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
        (mx1.redhat.com [10.5.110.38]); Mon, 24 Oct 2016 12:04:14 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1110
Lines: 31

On Mon, Oct 24, 2016 at 01:29:45PM +0200, Peter Zijlstra wrote:
> On Mon, Oct 24, 2016 at 01:27:32PM +0200, Peter Zijlstra wrote:
> > On Mon, Oct 24, 2016 at 01:15:27PM +0200, Oleg Nesterov wrote:
> > > How about the trivial fix below?
> > >
> > > Oleg.
> > >
> > > --- x/kernel/events/core.c
> > > +++ x/kernel/events/core.c
> > > @@ -1257,7 +1257,7 @@ static u32 perf_event_pid(struct perf_ev
> > >  	if (event->parent)
> > >  		event = event->parent;
> > >
> > > -	return task_tgid_nr_ns(p, event->ns);
> > > +	return pid_alive(p) ? task_tgid_nr_ns(p, event->ns) : 0;
> > >  }
> >
> > Also, now we get a (few) sample(s) with a different pid:tid than prior
> > samples and not matching the sched_switch() events.
> >
> > I can imagine that being somewhat confusing for people/tools.
> >
> > Acme/Jolsa, any idea if that will bugger perf-report?
>
> Hurm, then again, I imagine that after unhash_process the PID/TID could
> be instantly re-used and then we're still confused.

sounds bad.. I haven't checked the related pid_alive code,
but shouldn't we already get the EXIT event in this case?

jirka
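
For readers following the thread, here is a minimal user-space sketch of the
guard in Oleg's proposed diff. It is not kernel code: every model_* name below
is a hypothetical stand-in invented for illustration, and a NULL pointer is
only assumed here as a simple way to represent "the task no longer has a pid
attached". It does not reproduce the KASan report itself; it only shows why
samples taken after the unhash step would carry pid 0, which is the mismatch
with earlier samples that Peter raises.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; none of this is kernel code. */
struct model_pid {
    uint32_t tgid;              /* thread-group id as seen in one namespace */
};

struct model_task {
    struct model_pid *pid;      /* NULL once the task has been unhashed */
};

/* Stand-in for pid_alive(): true while the task still has a pid attached. */
static int model_pid_alive(const struct model_task *p)
{
    return p->pid != NULL;
}

/* Stand-in for the tgid lookup; only valid while a pid is attached. */
static uint32_t model_tgid_nr(const struct model_task *p)
{
    return p->pid->tgid;
}

/* Mirrors the shape of the proposed return statement: guard, then look up. */
static uint32_t model_perf_event_pid(const struct model_task *p)
{
    return model_pid_alive(p) ? model_tgid_nr(p) : 0;
}

/* Stand-in for the unhash step: the pid is detached from the task. */
static void model_unhash(struct model_task *p)
{
    p->pid = NULL;
}

/* Small self-check: a live task reports its tgid, an unhashed one reports 0. */
static void model_check(void)
{
    struct model_pid pid = { .tgid = 4242 };
    struct model_task task = { .pid = &pid };

    assert(model_perf_event_pid(&task) == 4242);
    model_unhash(&task);
    assert(model_perf_event_pid(&task) == 0);
}
```

Calling model_check() from a main() exercises both paths. The second assertion
is the case discussed above: once the task is unhashed, the reported pid drops
to 0 and no longer matches what earlier samples for the same task reported.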