Date: Thu, 21 May 2015 08:11:06 -0700
Subject: Re: [PATCH 01/10] perf,x86: Fix event/group validation
From: Stephane Eranian
To: Peter Zijlstra
Cc: Ingo Molnar, Vince Weaver, Jiri Olsa, "Liang, Kan", LKML,
    Andrew Hunter, Maria Dimakopoulou
In-Reply-To: <1432217038.30671.7.camel@twins>
References: <20150521111710.475482798@infradead.org>
    <20150521111932.592505273@infradead.org>
    <20150521125615.GO3644@twins.programming.kicks-ass.net>
    <20150521130952.GQ3644@twins.programming.kicks-ass.net>
    <20150521132015.GS3644@twins.programming.kicks-ass.net>
    <1432214957.30671.0.camel@twins>
    <1432217038.30671.7.camel@twins>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 21, 2015 at 7:03 AM, Peter Zijlstra wrote:
> On Thu, 2015-05-21 at 06:36 -0700, Stephane Eranian wrote:
>> On Thu, May 21, 2015 at 6:29 AM, Peter Zijlstra wrote:
>> > On Thu, 2015-05-21 at 06:27 -0700, Stephane Eranian wrote:
>> >> Or are you talking about a preemption while executing x86_schedule_events()?
>> >
>> > That.
>> >
>> > And we can of course cure that by an earlier patch I send; but I find it
>> > a much simpler rule to just never allow modifying global state for
>> > validation.
>>
>> I can see validation being preempted, but not the context switch code path.
>> Is that what you are talking about?
>>
>> You are saying validate_group() is in the middle of x86_schedule_events()
>> using fake_cpuc, when it gets preempted.
>> The context switch code when it loads
>> the new thread's PMU state calls x86_schedule_events() which modifies the
>> cpuc->event_list[]->hwc. But this is cpuc vs. fake_cpuc again. So yes, the calls
>> nest but they do not touch the same state.
>
> They both touch event->hw->constraint.
>
>> And when you eventually come back
>> to validate_group() you are back to using the fake_cpuc. So I am still not clear
>> on how the corruption can happen.
>
> validate_group()
>   x86_schedule_events()
>     event->hw.constraint = c; # store
>
>
>         perf_task_event_sched_in()
>           ...
>             x86_schedule_events();
>               event->hw.constraint = c2; # store
>
>               ...
>
>               put_event_constraints(event); # assume failure to schedule
>                 intel_put_event_constraints()
>                   event->hw.constraint = NULL;
>
>
>
>     c = event->hw.constraint; # read -> NULL
>
>     if (!test_bit(hwc->idx, c->idxmsk)) # <- *BOOM* NULL deref
>
>
> This in particular is possible when the event in question is a cpu-wide
> event and group-leader, where the validate_group() tries to add an event
> to the group.
>
Ok, I think I get it now. It is not related to fake_cpuc vs. cpuc. It is
related to the fact that the constraint is cached in the event struct itself,
and that struct is shared between validate_group() and x86_schedule_events()
because cpu_hw_events->event_list[] is an array of pointers to events and not
an array of events.
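
To make the aliasing concrete, here is a minimal user-space sketch of the
interleaving described above. It is not kernel code: the struct layouts are
simplified stand-ins, MAX_EVENTS and the helper names are hypothetical, and
the per-context event_constraint[] array is only an assumed illustration of
keeping validation state out of the shared event, not a description of what
the patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; names follow the thread, layouts are hypothetical. */
struct event_constraint { unsigned long idxmsk; };

struct hw_perf_event {
    int idx;
    struct event_constraint *constraint;   /* cached in the event itself */
};

struct perf_event { struct hw_perf_event hw; };

#define MAX_EVENTS 4                        /* hypothetical size */

struct cpu_hw_events {
    struct perf_event *event_list[MAX_EVENTS];              /* pointers, not copies */
    struct event_constraint *event_constraint[MAX_EVENTS];  /* assumed per-context cache */
    int n_events;
};

/* Stands in for the nested x86_schedule_events() call on the real cpuc. */
static void nested_schedule_shared(struct cpu_hw_events *cpuc,
                                   struct event_constraint *c2)
{
    struct perf_event *event = cpuc->event_list[0];

    event->hw.constraint = c2;    /* store */
    event->hw.constraint = NULL;  /* put after a failed schedule */
}

/* Same nested call, but touching only the context's own cache. */
static void nested_schedule_private(struct cpu_hw_events *cpuc,
                                    struct event_constraint *c2)
{
    cpuc->event_constraint[0] = c2;
    cpuc->event_constraint[0] = NULL;
}

/* Returns 1 if validation reads NULL after the nested call. */
int shared_cache_sees_null(void)
{
    struct event_constraint c = { 0x1 }, c2 = { 0x2 };
    struct perf_event leader = { { 0, NULL } };
    struct cpu_hw_events cpuc = { .event_list = { &leader }, .n_events = 1 };
    struct cpu_hw_events fake_cpuc = { .event_list = { &leader }, .n_events = 1 };

    /* Two distinct contexts, one and the same event object. */
    assert(cpuc.event_list[0] == fake_cpuc.event_list[0]);

    fake_cpuc.event_list[0]->hw.constraint = &c;  /* validation: store */
    nested_schedule_shared(&cpuc, &c2);           /* preempting path */
    return fake_cpuc.event_list[0]->hw.constraint == NULL;
}

/* Returns 1 if validation still sees its own constraint afterwards. */
int private_cache_survives(void)
{
    struct event_constraint c = { 0x1 }, c2 = { 0x2 };
    struct perf_event leader = { { 0, NULL } };
    struct cpu_hw_events cpuc = { .event_list = { &leader }, .n_events = 1 };
    struct cpu_hw_events fake_cpuc = { .event_list = { &leader }, .n_events = 1 };

    fake_cpuc.event_constraint[0] = &c;           /* validation: store */
    nested_schedule_private(&cpuc, &c2);          /* preempting path */
    return fake_cpuc.event_constraint[0] == &c;
}
```

Calling shared_cache_sees_null() shows the validation path reading back NULL
even though it only ever used fake_cpuc, because both event_list[] arrays
point at the same event. private_cache_survives() shows that the same
interleaving is harmless once the cached constraint lives in the per-context
structure instead of in the event.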