Date: Thu, 17 Nov 2016 09:18:48 -0600
From: Josh Poimboeuf
To: Peter Zijlstra
Cc: Vince Weaver, "linux-kernel@vger.kernel.org", Ingo Molnar,
    Arnaldo Carvalho de Melo, "davej@codemonkey.org.uk",
    "dvyukov@google.com", Stephane Eranian
Subject: Re: perf: fuzzer KASAN unwind_get_return_address
Message-ID: <20161117151848.7sdss3g4waanxfsk@treble>
In-Reply-To: <20161117090446.GC3142@twins.programming.kicks-ass.net>

On Thu, Nov 17, 2016 at 10:04:46AM +0100, Peter Zijlstra wrote:
> On Wed, Nov 16, 2016 at 10:48:28PM -0600, Josh Poimboeuf wrote:
> > Peter or Vince, can you try to recreate with this patch?  It dumps the
> > raw stack contents during a stack dump.  Hopefully that would give a
> > clue about what's going wrong.
>
>
> Here goes... I'll do another run and get you the results of that as
> well.

Thanks, I just waded through this and it turned up some good clues.  And
according to 'git blame', you might be able to help :-)

It's not stack corruption.  Instead it looks like
__intel_pmu_pebs_event() is creating a bad or stale pt_regs which gets
passed to the unwinder.  Specifically, regs->bp points to a seemingly
random address on the NMI stack.  Which seems odd, considering the code
itself is running on the same NMI stack.

I don't know much about the PEBS code but it seems like it's passing
some stale data.  Either that or there's some NMI nesting going on.

-- 
Josh
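
For illustration only: a minimal user-space sketch in C of why a stale
regs->bp is hard to catch.  This is not the kernel's unwinder; struct
fake_regs, frame_on_stack() and walk_frames() are hypothetical names,
and the two-word frame layout (saved bp, then return address) is an
assumption.  A frame pointer that still lands somewhere on the current
stack passes the bounds check, so the walk only fails, or returns junk,
once it interprets whatever words happen to sit at that address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the single pt_regs field discussed above. */
struct fake_regs {
    uintptr_t bp;   /* frame pointer captured with the sample */
};

/* Assumed frame layout: word 0 = caller's saved bp, word 1 = return address. */
#define FRAME_WORDS 2

/* True if a whole, word-aligned frame at 'bp' fits inside [lo, hi). */
static int frame_on_stack(uintptr_t bp, uintptr_t lo, uintptr_t hi)
{
    return bp >= lo && bp < hi &&
           hi - bp >= FRAME_WORDS * sizeof(uintptr_t) &&
           bp % sizeof(uintptr_t) == 0;
}

/*
 * Follow the frame-pointer chain starting at regs->bp.
 * Returns the number of return addresses stored in 'out', or -1 if the
 * chain leaves the stack or fails to move toward higher addresses
 * (the stack is assumed to grow down, so callers live higher up).
 * A saved bp of 0 marks the end of the chain.
 */
static int walk_frames(const struct fake_regs *regs,
                       uintptr_t lo, uintptr_t hi,
                       uintptr_t *out, int max)
{
    uintptr_t bp = regs->bp;
    int n = 0;

    assert(out != NULL);

    while (bp != 0 && n < max) {
        const uintptr_t *frame;

        if (!frame_on_stack(bp, lo, hi))
            return -1;

        frame = (const uintptr_t *)bp;
        out[n++] = frame[1];            /* return address */

        if (frame[0] != 0 && frame[0] <= bp)
            return -1;                  /* chain went backwards */
        bp = frame[0];                  /* caller's frame */
    }
    return n;
}
```

Note that frame_on_stack() alone accepts a stale bp as long as it lies
within the stack bounds; only the contents found there give it away.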