From: Vince Weaver <vincent.weaver@maine.edu>
Date: Fri, 14 Jul 2017 14:05:51 -0400 (EDT)
To: linux-kernel@vger.kernel.org
cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>,
        Ingo Molnar <mingo@kernel.org>, Peter Zijlstra <peterz@infradead.org>,
        Stephane Eranian <eranian@gmail.com>
Subject: perf: bisected sampling bug in Linux 4.11-rc1
Message-ID: <alpine.DEB.2.20.1707141332050.27846@macbook-air>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org


I was tracking down some regressions in my perf_event_test testsuite.
Some of the tests broke in the 4.11-rc1 timeframe.

I've bisected one of them; this report is about
	tests/overflow/simul_oneshot_group_overflow
This test creates an event group containing two sampling events, each
set to deliver a signal on overflow. The signal handler disables and
then refreshes the event.

On a good kernel you get the following:
	Event perf::instructions with period 1000000
	Event perf::instructions with period 2000000
		fd 3 overflows: 946 (perf::instructions/1000000)
		fd 4 overflows: 473 (perf::instructions/2000000)
	Ending counts:
		Count 0: 946379875
		Count 1: 946365218

With the broken kernels you get:
	Event perf::instructions with period 1000000
	Event perf::instructions with period 2000000
		fd 3 overflows: 938 (perf::instructions/1000000)
		fd 4 overflows: 318 (perf::instructions/2000000)
	Ending counts:
		Count 0: 946373080
		Count 1: 653373058
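
As a quick cross-check of the numbers above, here is a small sketch (not
part of perf_event_test) that divides each ending count by its sampling
period and compares the result with the reported overflow count. It
assumes the overflow count should be roughly count / period; the names
RUNS and summarize are made up for this illustration.

```python
# Figures copied from the two runs quoted above.
# Each tuple is (sampling period, reported overflows, ending count).
RUNS = {
    "good": [
        (1_000_000, 946, 946_379_875),
        (2_000_000, 473, 946_365_218),
    ],
    "broken": [
        (1_000_000, 938, 946_373_080),
        (2_000_000, 318, 653_373_058),
    ],
}


def summarize(run):
    """Return [(expected, reported), ...] per event and Count 0 - Count 1.

    "expected" assumes one overflow per full period counted, which is an
    assumption of this sketch, not something the test itself asserts.
    """
    rows = [(count // period, reported) for period, reported, count in run]
    gap = run[0][2] - run[1][2]
    return rows, gap


for name, run in RUNS.items():
    rows, gap = summarize(run)
    for i, (expected, reported) in enumerate(rows):
        print(f"{name}: event {i}: expected ~{expected} overflows, "
              f"reported {reported}")
    print(f"{name}: Count 0 - Count 1 = {gap}")
```

On the good kernel both events match (946 and 473) and the two ending
counts differ by only 14657. On the broken kernel each event reports 8
fewer overflows than its count suggests (938 vs 946, 318 vs 326), and
Count 1 trails Count 0 by 293000022 instructions.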


487f05e18aa4efacee6357480f293a5afe6593b5 is the first bad commit

commit 487f05e18aa4efacee6357480f293a5afe6593b5
Author: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Date:   Thu Jan 19 18:43:30 2017 +0200

    perf/core: Optimize event rescheduling on active contexts
    
    When new events are added to an active context, we go and reschedule
    all cpu groups and all task groups in order to preserve the priority
    (cpu pinned, task pinned, cpu flexible, task flexible), but in
    reality we only need to reschedule groups of the same priority as
    that of the events being added, and below.
    
    This patch changes the behavior so that only groups that need to be
    rescheduled are rescheduled.
    
    Reported-by: Adrian Hunter <adrian.hunter@intel.com>
    Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
    Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Stephane Eranian <eranian@google.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Vince Weaver <vincent.weaver@maine.edu>
    Cc: vince@deater.net
    Link: http://lkml.kernel.org/r/20170119164330.22887-3-alexander.shishkin@linux.intel.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>