Hi,
While working on optionally jit-compiling parts of postgres using llvm
(MCJIT currently, but Orc would have the same issue afaics), I'm trying
to use perf jit support to make profiling of those JITed parts easier.
Turns out the current jit support in perf doesn't work that well for
LLVM - but it doesn't primarily look like LLVM's fault. Syscall-wise
llvm does (heavily filtered):
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efd3866e000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efd3866d000
mprotect(0x7efd3866e000, 4096, PROT_READ|PROT_EXEC) = 0
mprotect(0x7efd3866d000, 4096, PROT_READ|PROT_EXEC) = 0
write(2, "Function loaded: evalexpr0 at 139626038091776 0x7efd3866e000 len 69", 68) = 68
mmap(0x7efd3866f000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efd3866c000
mmap(0x7efd3866e000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efd3866b000
mprotect(0x7efd3866c000, 4096, PROT_READ|PROT_EXEC) = 0
mprotect(0x7efd3866b000, 4096, PROT_READ|PROT_EXEC) = 0
write(2, "Function loaded: evalexpr1 at 139626038083584 0x7efd3866c000 len 69", 68) = 68
...
i.e. it mmaps single pages for the each JITed function's sections. Which
makes sense, because the first function is JITed independently from the
second one.
The corresponding MMAP2 records according to perf perf script
--show-mmap-events are:
postgres 4107 595444.867737: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866e000(0x1000) @ 0x7efd3866e000 00:00 0 0]: ---p //anon
postgres 4107 595444.867825: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866d000(0x2000) @ 0x7efd3866d000 00:00 0 0]: ---p //anon
postgres 4107 595444.884090: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866c000(0x3000) @ 0x7efd3866c000 00:00 0 0]: ---p //anon
postgres 4107 595444.884113: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866b000(0x4000) @ 0x7efd3866b000 00:00 0 0]: ---p //anon
Note how the size of the mapping continually increases, so that the each
MMAP2 record covers previous sections.
If one perf inject --jit into that it looks like:
postgres 4107 595444.867737: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866e000(0x1000) @ 0x7efd3866e000 00:00 0 0]: ---p //anon
postgres 4107 595444.867825: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866d000(0x2000) @ 0x7efd3866d000 00:00 0 0]: ---p //anon
postgres 4107 595444.868140: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866e000(0x45) @ 0x40 fd:02 33434534 1]: --xs /home/andres/.debug/jit/llvm-IR-jit-20161209.XXfN0K3O/jitted-4107-1.so
postgres 4107 595444.884090: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866c000(0x3000) @ 0x7efd3866c000 00:00 0 0]: ---p //anon
postgres 4107 595444.884113: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866b000(0x4000) @ 0x7efd3866b000 00:00 0 0]: ---p //anon
postgres 4107 595444.884232: PERF_RECORD_MMAP2 4107/4107: [0x7efd3866c000(0x45) @ 0x40 fd:02 33434599 1]: --xs /home/andres/.debug/jit/llvm-IR-jit-20161209.XXfN0K3O/jitted-4107-2.so
Note how the first injected record is also covered by the following
"//anon" event. This leads to the the curious effect that samples for
the first function (evalexpr0) are associated with the right generated
.so, until the second function is JITed.
I hacked up perf inject to omit such MMAP2 records by adding
if (event->mmap2.prot == 0)
return 0;
to perf_event__jit_repipe_mmap2() and suddenly things work.
I presume the increasing MMAP2 size is triggered by the consecutive
pages being represented as a single page-range in the kernel?
If I, to work around such consecutive pages, force another page to be
mmap()ed inbetween, and avoid using MAP_ANONYMOUS, the problem also goes
away.
Am I doing something wrong, or is there a bug here?
FWIW, this is on linux 4.8.8, with perf from master
(v4.9-rc8-108-g810ac7b7558d).
BTW, it's also a bit weird that those MMAP2 records triggered by
mprotect/mmap, have prot set to 0...
Regards,
Andres
On Fri, Dec 09, 2016 at 09:02:18PM -0800, Andres Freund wrote:
>
> I presume the increasing MMAP2 size is triggered by the consecutive
> pages being represented as a single page-range in the kernel?
Yes, we print struct vm_area_struct based information, if vmas get
merged, that shows.
> If I, to work around such consecutive pages, force another page to be
> mmap()ed inbetween, and avoid using MAP_ANONYMOUS, the problem also goes
> away.
This would indeed inhibit vma merging.
> Am I doing something wrong, or is there a bug here?
Expected behaviour afaict
> BTW, it's also a bit weird that those MMAP2 records triggered by
> mprotect/mmap, have prot set to 0...
Yes, mprotect() does: vma->vm_flags = newflags; before calling
perf_event_mmap(vma); which then looks at VM_{READ,WRITE,EXEC} bits in
that word to generate the prot value.
So that being 0 is a bit weird.
On December 12, 2016 12:49:03 AM PST, Peter Zijlstra <[email protected]> wrote:
>On Fri, Dec 09, 2016 at 09:02:18PM -0800, Andres Freund wrote:
>> Am I doing something wrong, or is there a bug here?
>
>Expected behaviour afaict
So I need to prevent vma merging to use perf jit support? That seems a bit weird. Possibly the inject --jit pass needs to do something about this? A hack might be to re-emit the fake mmap2 records after ones potentially overwriting previously emitted records. Don't know perf internals well enough to suggest something prettier.
Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
On Mon, Dec 12, 2016 at 01:01:48AM -0800, Andres Freund wrote:
>
>
> On December 12, 2016 12:49:03 AM PST, Peter Zijlstra <[email protected]> wrote:
> >On Fri, Dec 09, 2016 at 09:02:18PM -0800, Andres Freund wrote:
>
> >> Am I doing something wrong, or is there a bug here?
> >
> >Expected behaviour afaict
>
> So I need to prevent vma merging to use perf jit support? That seems
> a bit weird.
No idea about that, I don't really have the whole JIT stuff present atm.
I just know how the mmap stuff works.
Stephane?