Date: Fri, 22 May 2015 08:49:55 +0200
From: Ingo Molnar
To: Stephane Eranian
Cc: Peter Zijlstra, Vince Weaver, Jiri Olsa, "Liang, Kan", LKML, Andrew Hunter, Maria Dimakopoulou
Subject: Re: [PATCH 01/10] perf,x86: Fix event/group validation
Message-ID: <20150522064955.GA26489@gmail.com>

* Stephane Eranian wrote:

> On Thu, May 21, 2015 at 7:03 AM, Peter Zijlstra wrote:
> > On Thu, 2015-05-21 at 06:36 -0700, Stephane Eranian wrote:
> >> On Thu, May 21, 2015 at 6:29 AM, Peter Zijlstra wrote:
> >> > On Thu, 2015-05-21 at 06:27 -0700, Stephane Eranian wrote:
> >> >> Or are you talking about a preemption while executing x86_schedule_events()?
> >> >
> >> > That.
> >> >
> >> > And we can of course cure that by an earlier patch I send; but I find it
> >> > a much simpler rule to just never allow modifying global state for
> >> > validation.
> >>
> >> I can see validation being preempted, but not the context switch code path.
> >> Is that what you are talking about?
> >>
> >> You are saying validate_group() is in the middle of x86_schedule_events()
> >> using fake_cpuc, when it gets preempted. The context switch code when it loads
> >> the new thread's PMU state calls x86_schedule_events() which modifies the
> >> cpuc->event_list[]->hwc. But this is cpuc vs. fake_cpuc again. So yes, the calls
> >> nest but they do not touch the same state.
> >
> > They both touch event->hw->constraint.
> >
> >> And when you eventually come back
> >> to validate_group() you are back to using the fake_cpuc. So I am still not clear
> >> on how the corruption can happen.
> >
> > validate_group()
> >   x86_schedule_events()
> >     event->hw.constraint = c; # store
> >
> >         perf_task_event_sched_in()
> >           ...
> >             x86_schedule_events();
> >               event->hw.constraint = c2; # store
> >
> >               ...
> >
> >               put_event_constraints(event); # assume failure to schedule
> >                 intel_put_event_constraints()
> >                   event->hw.constraint = NULL;
> >
> >     c = event->hw.constraint; # read -> NULL
> >
> >     if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref
> >
> >
> > This in particular is possible when the event in question is a cpu-wide
> > event and group-leader, where the validate_group() tries to add an event
> > to the group.
>
> Ok, I think I get it now. It is not related to fake_cpuc vs. cpuc,
> it is related to the fact that the constraint is cached in the event
> struct itself and that one is shared between validate_group() and
> x86_schedule_events() because cpu_hw_event->event_list[] is an array
> of pointers to events and not an array of events.

Btw., comments and the code structure should be greatly enhanced to 
make all that very clear and hard to mess up.

A month ago perf became fuzzing-proof, and now that's down the drain 
again...

Thanks,

	Ingo
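
For illustration only, here is a minimal user-space sketch of the aliasing
described in the quoted exchange. It is not kernel code: the structs, the
private_c[] array and every function name below are hypothetical, simplified
stand-ins, and the preemption is replayed by hand in program order rather
than with real concurrency. It shows why two contexts that hold pointers to
the same event end up sharing a constraint cached in that event, and how
keeping the constraint in the context itself (one possible arrangement, not
necessarily the one the patch series uses) avoids the clobbering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's definitions. */
#define MAX_EVENTS 4

struct constraint {
	unsigned long idxmsk;		/* counters this event may use */
};

struct event {
	struct constraint *constraint;	/* cache shared by every user of the event */
};

struct cpu_ctx {
	struct event *event_list[MAX_EVENTS];		/* pointers, not copies */
	struct constraint *private_c[MAX_EVENTS];	/* per-context alternative */
	int n_events;
};

static struct constraint c1 = { 0x1 };
static struct constraint c2 = { 0x2 };

/* Racy variant: the scheduling pass caches its constraint in the event. */
static void store_in_event(struct cpu_ctx *ctx, struct constraint *c)
{
	assert(ctx->n_events <= MAX_EVENTS);
	for (int i = 0; i < ctx->n_events; i++)
		ctx->event_list[i]->constraint = c;
}

/* Failure path of the racy variant: drop the cached constraint. */
static void put_in_event(struct cpu_ctx *ctx)
{
	for (int i = 0; i < ctx->n_events; i++)
		ctx->event_list[i]->constraint = NULL;
}

/* Alternative: keep the constraint in the context doing the scheduling. */
static void store_in_ctx(struct cpu_ctx *ctx, struct constraint *c)
{
	assert(ctx->n_events <= MAX_EVENTS);
	for (int i = 0; i < ctx->n_events; i++)
		ctx->private_c[i] = c;
}

/* Failure path of the alternative: only this context's slots are cleared. */
static void put_in_ctx(struct cpu_ctx *ctx)
{
	for (int i = 0; i < ctx->n_events; i++)
		ctx->private_c[i] = NULL;
}

/* Replays the interleaving by hand; returns what validation reads back. */
struct constraint *replay_shared(void)
{
	struct event leader = { NULL };
	struct cpu_ctx cpuc = { .event_list = { &leader }, .n_events = 1 };
	struct cpu_ctx fake_cpuc = { .event_list = { &leader }, .n_events = 1 };

	store_in_event(&fake_cpuc, &c1);	/* validation: store */
	store_in_event(&cpuc, &c2);		/* preempting path: store */
	put_in_event(&cpuc);			/* preempting path fails to schedule */
	return fake_cpuc.event_list[0]->constraint;	/* validation: read */
}

/* Same interleaving, but each context keeps its own constraint slot. */
struct constraint *replay_private(void)
{
	struct event leader = { NULL };
	struct cpu_ctx cpuc = { .event_list = { &leader }, .n_events = 1 };
	struct cpu_ctx fake_cpuc = { .event_list = { &leader }, .n_events = 1 };

	store_in_ctx(&fake_cpuc, &c1);		/* validation: store */
	store_in_ctx(&cpuc, &c2);		/* preempting path: store */
	put_in_ctx(&cpuc);			/* preempting path fails to schedule */
	return fake_cpuc.private_c[0];		/* validation: read */
}
```

In this sketch replay_shared() returns NULL, the value validation would go on
to dereference, while replay_private() still returns the constraint that
validation stored, because the second context never wrote through the shared
event pointer.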