Hello
My student actually noticed this before I did, I was hoping it was some
sort of error in her data.
Anyway all perf_event functionality (especially reads) has become about
20x slower, at least on Intel machines (haswell and skylake are the only
ones I've tested) sometime between 4.9 and 4.11-rc
For example, in the PAPI tests:
4.11-rc2
Total cost for PAPI_read (2 counters) over 1000000 iterations
min cycles : 15192
max cycles : 3887735
mean cycles : 15662.057418
std deviation: 19079.398693
4.9
Total cost for PAPI_read (2 counters) over 1000000 iterations
min cycles : 864
max cycles : 78459
mean cycles : 908.010315
std deviation: 144.875697
The perf_event_test validation tests are also showing this, even when
using rdpmc() rather than read.
Is there a likely change that might have caused this? Hoping to avoid
bisecting it as that will kill the rest of the week probably.
Vince
On Thu, 16 Mar 2017, Vince Weaver wrote:
> Anyway all perf_event functionality (especially reads) has become about
> 20x slower, at least on Intel machines (haswell and skylake are the only
> ones I've tested) sometime between 4.9 and 4.11-rc
False alarm, I forgot I had debugging (KASAN, etc) turned on in this
kernel tree because I had been fuzzing.
sorry about that.
Vince
On Thu, Mar 16, 2017 at 11:54:58AM -0400, Vince Weaver wrote:
> Hello
>
> My student actually noticed this before I did, I was hoping it was some
> sort of error in her data.
>
> Anyway all perf_event functionality (especially reads) has become about
> 20x slower, at least on Intel machines (haswell and skylake are the only
> ones I've tested) sometime between 4.9 and 4.11-rc
>
> For example, in the PAPI tests:
>
> 4.11-rc2
>
> Total cost for PAPI_read (2 counters) over 1000000 iterations
> min cycles : 15192
> max cycles : 3887735
> mean cycles : 15662.057418
> std deviation: 19079.398693
>
>
> 4.9
>
> Total cost for PAPI_read (2 counters) over 1000000 iterations
> min cycles : 864
> max cycles : 78459
> mean cycles : 908.010315
> std deviation: 144.875697
>
>
> The perf_event_test validation tests are also showing this, even when
> using rdpmc() rather than read.
>
> Is there a likely change that might have caused this? Hoping to avoid
> bisecting it as that will kill the rest of the week probably.
No immediate clue; but I can have a poke. Do you have a handy way of
showing this using: perf test rdpmc, or should I hack something
together?
On Thu, Mar 16, 2017 at 01:57:31PM -0400, Vince Weaver wrote:
> On Thu, 16 Mar 2017, Vince Weaver wrote:
>
> > Anyway all perf_event functionality (especially reads) has become about
> > 20x slower, at least on Intel machines (haswell and skylake are the only
> > ones I've tested) sometime between 4.9 and 4.11-rc
>
> False alarm, I forgot I had debugging (KASAN, etc) turned on in this
> kernel tree because I had been fuzzing.
>
> sorry about that.
Ha! no problem.