2018-09-24 13:09:34

by Yury Norov

Subject: [PATCH] mm: fix COW faults after mlock()

After mlock() on newly mmap()ed shared memory I observe page faults.

The problem is that populate_vma_page_range() doesn't set the FOLL_WRITE
flag for writable shared memory in the mlock() path, with this justification:
/*
* We want to touch writable mappings with a write fault in order
* to break COW, except for shared mappings because these don't COW
* and we would not want to dirty them for nothing.
*/

But they do get COWed. The most straightforward way to avoid that is to
set the FOLL_WRITE flag for shared mappings as well as for private ones.

This is a partial revert of commit 5ecfda041e4b4 ("mlock: avoid
dirtying pages and triggering writeback"), so it re-enables dirtying.

The fix works for me (arm64, kernels v4.19-rc4 and v4.9), but after digging
into the code I still don't understand why we need to do copy-on-write on
shared memory. If the comment above was correct when 5ecfda041e4b4 went
upstream in 2011, then shared mappings were not COWed back then but are
COWed now. If so, that is another issue to be fixed.

Signed-off-by: Yury Norov <[email protected]>
---
mm/gup.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 1abc8b4afff6..1899e8bac06b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1202,10 +1202,9 @@ long populate_vma_page_range(struct vm_area_struct *vma,
gup_flags &= ~FOLL_POPULATE;
/*
* We want to touch writable mappings with a write fault in order
- * to break COW, except for shared mappings because these don't COW
- * and we would not want to dirty them for nothing.
+ * to break COW.
*/
- if ((vma->vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
+ if (vma->vm_flags & VM_WRITE)
gup_flags |= FOLL_WRITE;

/*
--
2.17.1



2018-09-24 21:23:51

by Kirill A. Shutemov

Subject: Re: [PATCH] mm: fix COW faults after mlock()

On Mon, Sep 24, 2018 at 04:08:52PM +0300, Yury Norov wrote:
> After mlock() on newly mmap()ed shared memory I observe page faults.
>
> The problem is that populate_vma_page_range() doesn't set FOLL_WRITE
> flag for writable shared memory in mlock() path, arguing that like:
> /*
> * We want to touch writable mappings with a write fault in order
> * to break COW, except for shared mappings because these don't COW
> * and we would not want to dirty them for nothing.
> */
>
> But they are actually COWed. The most straightforward way to avoid it
> is to set FOLL_WRITE flag for shared mappings as well as for private ones.

Huh? How do shared mappings get CoWed?

In this context CoW means creating a private copy of the page for the
process. It only makes sense for private mappings, since the pages in a
shared mapping do not belong to the process.
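
To illustrate with a minimal user-space sketch (the temp-file name is
arbitrary): a write through a MAP_PRIVATE mapping is CoWed into a private
copy and never reaches the file, while a write through a MAP_SHARED mapping
dirties the shared page cache page.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
	long pg = sysconf(_SC_PAGESIZE);
	char tmpfile[] = "/tmp/cow_demo-XXXXXX";
	int fd = mkstemp(tmpfile);
	char buf[2] = { 0, 0 };
	char *priv, *shar;

	ftruncate(fd, pg);
	priv = mmap(NULL, pg, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
	shar = mmap(NULL, pg, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	priv[0] = 'P';	/* CoW: the process gets its own copy of the page */
	shar[1] = 'S';	/* dirties the shared page cache page */

	pread(fd, buf, 2, 0);
	printf("file contains: [%d, %c] (the private 'P' never reached it)\n",
	       buf[0], buf[1]);

	munmap(priv, pg);
	munmap(shar, pg);
	close(fd);
	unlink(tmpfile);
	return 0;
}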

Shared mappings will still get faults, but a bit later: after the page
is written back to disk, it gets cleaned and write-protected to catch
the next write access.

A notable exception is tmpfs/shmem. Those pages do not go through the
normal writeback process. But the code path is used for other filesystems
as well.

Therefore, NAK. This only creates unneeded writeback traffic.

--
Kirill A. Shutemov

2018-09-24 23:49:25

by Yury Norov

Subject: Re: [PATCH] mm: fix COW faults after mlock()

On Tue, Sep 25, 2018 at 12:22:47AM +0300, Kirill A. Shutemov wrote:
> On Mon, Sep 24, 2018 at 04:08:52PM +0300, Yury Norov wrote:
> > After mlock() on newly mmap()ed shared memory I observe page faults.
> >
> > The problem is that populate_vma_page_range() doesn't set FOLL_WRITE
> > flag for writable shared memory in mlock() path, arguing that like:
> > /*
> > * We want to touch writable mappings with a write fault in order
> > * to break COW, except for shared mappings because these don't COW
> > * and we would not want to dirty them for nothing.
> > */
> >
> > But they are actually COWed. The most straightforward way to avoid it
> > is to set FOLL_WRITE flag for shared mappings as well as for private ones.
>
> Huh? How do shared mapping get CoWed?
>
> In this context CoW means to create a private copy of the page for the
> process. It only makes sense for private mappings as all pages in shared
> mappings do not belong to the process.
>
> Shared mappings will still get faults, but a bit later -- after the page
> is written back to disc, the page get clear and write protected to catch
> the next write access.
>
> Noticeable exception is tmpfs/shmem. These pages do not belong to normal
> write back process. But the code path is used for other filesystems as
> well.
>
> Therefore, NAK. You only create unneeded write back traffic.

Hi Kirill,

(My first reaction was exactly like yours, but) on my real
system (Cavium OcteonTX2) and in my qemu simulation I can reproduce
the same behavior: memory that has just been mlock()ed causes faults.
Those faults happen because the page is mapped into the process
read-only while the underlying VMA is read-write, so they are resolved
simply by granting write access to the page.

Maybe I am using the term COW wrongly here, but this is how faultin_page()
works, and it sets the FOLL_COW bit before returning (which is ignored
at the upper level).

I realize that the proper fix may be more complex, and if so I'll
thankfully take it and drop this patch from my tree, but this is all
I have so far to address the problem.

The user code below is a reproducer.

Thanks,
Yury

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
	int i, ret, len = getpagesize() * 1000;
	char tmpfile[] = "/tmp/my_tmp-XXXXXX";
	char *ptr;
	int fd = mkstemp(tmpfile);

	if (fd < 0) {
		printf("Failed to mkstemp: %d\n", errno);
		return 1;
	}

	ret = ftruncate(fd, len);
	if (ret) {
		printf("Failed to ftruncate: %d\n", errno);
		goto out;
	}

	ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (ptr == MAP_FAILED) {
		printf("Failed to mmap memory: %d\n", errno);
		goto out;
	}

	ret = mlock(ptr, len);
	if (ret) {
		printf("Failed to mlock: %d\n", errno);
		goto out;
	}

	printf("Touch...\n");

	for (i = 0; i < len; i++)
		ptr[i] = (char) i;	/* Faults here. */

	printf("\t... done\n");
out:
	close(fd);
	unlink(tmpfile);
	return 0;
}

2018-09-25 10:49:33

by Kirill A. Shutemov

Subject: Re: [PATCH] mm: fix COW faults after mlock()

On Tue, Sep 25, 2018 at 02:48:43AM +0300, Yury Norov wrote:
> On Tue, Sep 25, 2018 at 12:22:47AM +0300, Kirill A. Shutemov wrote:
> > On Mon, Sep 24, 2018 at 04:08:52PM +0300, Yury Norov wrote:
> > > After mlock() on newly mmap()ed shared memory I observe page faults.
> > >
> > > The problem is that populate_vma_page_range() doesn't set FOLL_WRITE
> > > flag for writable shared memory in mlock() path, arguing that like:
> > > /*
> > > * We want to touch writable mappings with a write fault in order
> > > * to break COW, except for shared mappings because these don't COW
> > > * and we would not want to dirty them for nothing.
> > > */
> > >
> > > But they are actually COWed. The most straightforward way to avoid it
> > > is to set FOLL_WRITE flag for shared mappings as well as for private ones.
> >
> > Huh? How do shared mapping get CoWed?
> >
> > In this context CoW means to create a private copy of the page for the
> > process. It only makes sense for private mappings as all pages in shared
> > mappings do not belong to the process.
> >
> > Shared mappings will still get faults, but a bit later -- after the page
> > is written back to disc, the page get clear and write protected to catch
> > the next write access.
> >
> > Noticeable exception is tmpfs/shmem. These pages do not belong to normal
> > write back process. But the code path is used for other filesystems as
> > well.
> >
> > Therefore, NAK. You only create unneeded write back traffic.
>
> Hi Kirill,
>
> (My first reaction was exactly like yours indeed, but) on my real
> system (Cavium OcteonTX2), and on my qemu simulation I can reproduce
> the same behavior: just mlock()ed memory causes faults. That faults
> happen because page is mapped to the process as read-only, while
> underlying VMA is read-write. So faults get resolved well by just
> setting write access to the page.

mlock() doesn't guarantee that you'll never get a *minor* fault. Writeback
or page migration will get these pages write-protected.

Making pages write-protected is what we rely on for proper dirty
accounting: filesystems need to know when a page gets dirty so they can
allocate resources to write the page back properly. Once the page has
been written back to storage it gets write-protected again to catch the
next write access.
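
A minimal user-space sketch of that cycle (the file name is arbitrary, and
it assumes /tmp sits on a regular filesystem rather than tmpfs): the first
pass of writes to the mlock()ed mapping takes minor faults, msync(MS_SYNC)
cleans and write-protects the pages, and the second pass faults again. On
tmpfs the second pass does not refault, matching the shmem exception
mentioned below.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/resource.h>

static long minor_faults(void)
{
	struct rusage ru;

	getrusage(RUSAGE_SELF, &ru);
	return ru.ru_minflt;
}

int main(void)
{
	int i, len = getpagesize() * 100;
	char tmpfile[] = "/tmp/wb_demo-XXXXXX";
	int fd = mkstemp(tmpfile);
	char *ptr;
	long before;

	ftruncate(fd, len);
	ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	mlock(ptr, len);

	before = minor_faults();
	for (i = 0; i < len; i += getpagesize())
		ptr[i] = 1;
	printf("first touch: %ld minor faults\n", minor_faults() - before);

	/* Write the pages back; the kernel write-protects the cleaned PTEs. */
	msync(ptr, len, MS_SYNC);

	before = minor_faults();
	for (i = 0; i < len; i += getpagesize())
		ptr[i] = 2;
	printf("after msync: %ld minor faults\n", minor_faults() - before);

	close(fd);
	unlink(tmpfile);
	return 0;
}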

I guess we can make the situation a bit better for shmem/tmpfs: we could
populate such shared mappings with FOLL_WRITE. But this patch is not good
for that task.
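
A rough sketch of that idea in populate_vma_page_range() (assuming
vma_is_shmem() is a suitable predicate here; this is a sketch, not a
posted patch):

	/*
	 * Sketch only: keep breaking COW for writable private mappings,
	 * and additionally pre-fault writable shmem/tmpfs shared mappings
	 * with a write fault, since their pages are not cleaned by regular
	 * writeback. Other shared mappings stay read-faulted to avoid
	 * useless writeback traffic.
	 */
	if ((vma->vm_flags & VM_WRITE) &&
	    (!(vma->vm_flags & VM_SHARED) || vma_is_shmem(vma)))
		gup_flags |= FOLL_WRITE;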

--
Kirill A. Shutemov

2018-10-11 05:39:03

by Chen, Rong A

Subject: [LKP] [mm] dd12385915: vm-scalability.median 18.6% improvement

Greetings,

FYI, we noticed an 18.6% improvement in vm-scalability.median due to commit:


commit: dd12385915f4f83b738467e599b053c33dffbd48 ("[PATCH] mm: fix COW faults after mlock()")
url: https://github.com/0day-ci/linux/commits/Yury-Norov/mm-fix-COW-faults-after-mlock/20180925-174527
base: https://github.com/thesofproject/linux master

in testcase: vm-scalability
on test machine: 80 threads Skylake with 64G memory
with following parameters:

runtime: 300s
size: 1T
test: msync
cpufreq_governor: performance

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/



Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2018-04-03.cgz/300s/1T/lkp-ivb-d02/msync/vm-scalability/0x20

commit:
dd52cb8790 (" platform-drivers-x86 for v4.17-4")
dd12385915 ("mm: fix COW faults after mlock()")

dd52cb879063ca54 dd12385915f4f83b738467e599
---------------- --------------------------
fail:runs %reproduction fail:runs
| | |
1:4 -25% :4 dmesg.WARNING:at#for_ip_error_entry/0x
0:4 18% 1:4 perf-profile.calltrace.cycles-pp.error_entry
2:4 -18% 2:4 perf-profile.children.cycles-pp.error_entry
0:4 0% 0:4 perf-profile.children.cycles-pp.schedule_timeout
2:4 -17% 1:4 perf-profile.self.cycles-pp.error_entry
%stddev %change %stddev
\ | \
277824 ± 4% +18.6% 329586 ± 6% vm-scalability.median
0.46 ± 6% -46.9% 0.24 ± 27% vm-scalability.median_stddev
2817 ± 26% +5.7e+06% 1.608e+08 ± 6% vm-scalability.time.file_system_inputs
6509 ± 28% -73.7% 1714 ± 3% vm-scalability.time.major_page_faults
142.50 ± 5% -40.0% 85.50 ± 4% vm-scalability.time.percent_of_cpu_this_job_got
281.87 ± 6% -35.2% 182.76 ± 5% vm-scalability.time.system_time
152.54 ± 6% -46.3% 81.85 ± 5% vm-scalability.time.user_time
448928 ± 6% -31.1% 309105 ± 8% vm-scalability.time.voluntary_context_switches
2.093e+08 ± 6% -44.9% 1.154e+08 ± 5% vm-scalability.workload
18807972 ± 5% -27.3% 13664118 ± 6% interrupts.CAL:Function_call_interrupts
88890 ± 2% -10.4% 79618 ± 2% softirqs.RCU
100860 ± 2% +31.5% 132623 ± 2% softirqs.SCHED
9.00 +2.8e+06% 256051 ± 5% vmstat.io.bi
6.00 -50.0% 3.00 vmstat.memory.buff
67870 ± 4% -25.0% 50893 ± 5% vmstat.system.in
14.04 ± 8% +2.4 16.47 ± 7% mpstat.cpu.idle%
29.68 ± 7% +13.5 43.20 ± 2% mpstat.cpu.iowait%
42.68 ± 5% -10.1 32.62 ± 5% mpstat.cpu.sys%
13.58 ± 4% -5.9 7.70 ± 4% mpstat.cpu.usr%
294613 +88.5% 555281 slabinfo.buffer_head.active_objs
7559 +88.5% 14248 slabinfo.buffer_head.active_slabs
294837 +88.5% 555706 slabinfo.buffer_head.num_objs
7559 +88.5% 14248 slabinfo.buffer_head.num_slabs
817.75 ± 6% -9.7% 738.25 ± 7% slabinfo.proc_inode_cache.num_objs
454.00 ± 8% +24.4% 565.00 ± 18% slabinfo.skbuff_head_cache.active_objs
13651918 ± 29% +56.9% 21419299 ± 8% cpuidle.C1.time
1.931e+08 ± 6% +51.1% 2.919e+08 ± 2% cpuidle.C1E.time
508755 ± 6% +33.4% 678785 ± 2% cpuidle.C1E.usage
16723882 ± 4% +73.1% 28943328 cpuidle.C3.time
21421 ± 3% +78.8% 38298 cpuidle.C3.usage
3.002e+08 ± 5% +29.2% 3.879e+08 ± 4% cpuidle.C6.time
312454 ± 4% +28.8% 402526 ± 4% cpuidle.C6.usage
107432 ± 20% +61.4% 173441 ± 12% cpuidle.POLL.time
18727 ± 8% -25.9% 13876 ± 6% cpuidle.POLL.usage
1073224 ± 2% +17.7% 1263701 meminfo.Active(file)
308786 ± 3% +18.0% 364452 ± 2% meminfo.Dirty
1342759 ± 2% -20.9% 1061562 ± 3% meminfo.Inactive
172701 ± 16% -39.3% 104801 ± 3% meminfo.Inactive(anon)
1170057 -18.2% 956761 ± 3% meminfo.Inactive(file)
13762 -20.3% 10968 ± 2% meminfo.PageTables
85607 +29.8% 111114 meminfo.SReclaimable
107046 +23.8% 132534 meminfo.Slab
37561 ± 4% +23.5% 46379 ± 4% meminfo.Writeback
84144 ± 7% -26.0% 62266 ± 5% sched_debug.cfs_rq:/.exec_clock.avg
85958 ± 6% -24.6% 64822 ± 5% sched_debug.cfs_rq:/.exec_clock.max
82818 ± 7% -26.6% 60788 ± 5% sched_debug.cfs_rq:/.exec_clock.min
209438 ± 7% -41.9% 121723 ± 4% sched_debug.cfs_rq:/.min_vruntime.avg
218172 ± 8% -40.0% 130873 ± 3% sched_debug.cfs_rq:/.min_vruntime.max
199461 ± 6% -43.1% 113416 ± 7% sched_debug.cfs_rq:/.min_vruntime.min
1399 ± 2% -16.2% 1172 ± 11% sched_debug.cpu.curr->pid.stddev
15.87 ± 25% +104.2% 32.42 ± 35% sched_debug.cpu.nr_uninterruptible.max
-14.25 +91.2% -27.25 sched_debug.cpu.nr_uninterruptible.min
11.61 ± 28% +98.0% 22.99 ± 28% sched_debug.cpu.nr_uninterruptible.stddev
127503 ± 14% +26.0% 160648 ± 4% sched_debug.cpu.sched_goidle.min
0.00 ± 94% -91.3% 0.00 ±158% sched_debug.rt_rq:/.rt_time.stddev
1879 ± 4% -27.7% 1358 ± 5% turbostat.Avg_MHz
57.42 ± 4% -15.7 41.70 ± 4% turbostat.Busy%
1.11 ± 30% +0.6 1.71 ± 7% turbostat.C1%
508750 ± 6% +33.4% 678785 ± 2% turbostat.C1E
15.71 ± 7% +7.6 23.32 ± 2% turbostat.C1E%
21419 ± 3% +78.8% 38298 turbostat.C3
1.36 ± 5% +1.0 2.31 turbostat.C3%
312453 ± 4% +28.8% 402527 ± 4% turbostat.C6
24.42 ± 6% +6.6 31.00 ± 5% turbostat.C6%
30.15 ± 8% +39.3% 42.01 ± 3% turbostat.CPU%c1
1.09 ± 5% +60.0% 1.75 ± 6% turbostat.CPU%c3
11.33 ± 9% +28.3% 14.54 ± 6% turbostat.CPU%c6
12.77 ± 3% -18.3% 10.43 ± 3% turbostat.CorWatt
39798103 ± 5% -25.5% 29652069 ± 6% turbostat.IRQ
30.09 -8.0% 27.69 turbostat.PkgWatt
4.902e+11 ± 5% -41.5% 2.869e+11 ± 28% perf-stat.branch-instructions
3.636e+09 ± 8% -32.3% 2.461e+09 ± 28% perf-stat.branch-misses
66.28 -7.3 58.95 perf-stat.cache-miss-rate%
9.761e+09 ± 7% -26.2% 7.2e+09 ± 28% perf-stat.cache-misses
1.11 +5.9% 1.17 perf-stat.cpi
2.298e+12 ± 6% -35.6% 1.48e+12 ± 27% perf-stat.cpu-cycles
19198 ± 5% +14.7% 22028 ± 5% perf-stat.cpu-migrations
3.858e+09 ± 15% -42.9% 2.205e+09 ± 16% perf-stat.dTLB-load-misses
5.213e+11 ± 5% -36.1% 3.332e+11 ± 28% perf-stat.dTLB-loads
0.12 ± 9% -0.0 0.09 ± 18% perf-stat.dTLB-store-miss-rate%
3.446e+08 ± 10% -52.5% 1.638e+08 ± 17% perf-stat.dTLB-store-misses
2.839e+11 ± 5% -35.2% 1.839e+11 ± 28% perf-stat.dTLB-stores
74.39 +7.5 81.93 perf-stat.iTLB-load-miss-rate%
3.687e+08 ± 8% -45.1% 2.024e+08 ± 27% perf-stat.iTLB-load-misses
1.272e+08 ± 10% -65.7% 43583231 ± 21% perf-stat.iTLB-loads
2.077e+12 ± 5% -39.0% 1.266e+12 ± 28% perf-stat.instructions
5646 ± 3% +10.3% 6228 ± 4% perf-stat.instructions-per-iTLB-miss
0.90 -5.5% 0.85 perf-stat.ipc
47044927 ± 6% -44.5% 26119814 ± 5% perf-stat.minor-faults
47046188 ± 6% -44.5% 26120941 ± 5% perf-stat.page-faults
22566 ± 39% -78.5% 4857 ± 4% proc-vmstat.allocstall_movable
120471 ± 14% -86.8% 15866 ± 5% proc-vmstat.allocstall_normal
7661 ± 28% -78.7% 1631 ± 11% proc-vmstat.compact_fail
7682 ± 28% -78.4% 1659 ± 11% proc-vmstat.compact_stall
2752 ± 5% -37.9% 1710 ± 4% proc-vmstat.kswapd_low_wmark_hit_quickly
267103 ± 2% +17.7% 314329 ± 2% proc-vmstat.nr_active_file
78076 ± 3% +17.3% 91559 proc-vmstat.nr_dirty
7811 ± 6% -21.7% 6118 ± 28% proc-vmstat.nr_free_cma
43289 ± 16% -39.4% 26248 ± 3% proc-vmstat.nr_inactive_anon
293783 -17.7% 241780 ± 4% proc-vmstat.nr_inactive_file
3445 -20.5% 2740 proc-vmstat.nr_page_table_pages
21416 +29.8% 27796 proc-vmstat.nr_slab_reclaimable
9153 ± 2% +26.1% 11542 ± 2% proc-vmstat.nr_writeback
267072 ± 2% +17.7% 314320 ± 2% proc-vmstat.nr_zone_active_file
43289 ± 16% -39.4% 26251 ± 3% proc-vmstat.nr_zone_inactive_anon
293694 -17.7% 241712 ± 4% proc-vmstat.nr_zone_inactive_file
87180 ± 3% +18.2% 103074 proc-vmstat.nr_zone_write_pending
85522443 ± 6% -43.5% 48347251 ± 5% proc-vmstat.numa_hit
85522443 ± 6% -43.5% 48347251 ± 5% proc-vmstat.numa_local
2958 ± 5% -35.3% 1913 ± 3% proc-vmstat.pageoutrun
51287873 ± 6% -43.2% 29135174 ± 5% proc-vmstat.pgactivate
28504604 ± 8% -18.2% 23322955 ± 5% proc-vmstat.pgalloc_dma32
57168735 ± 6% -56.0% 25151136 ± 8% proc-vmstat.pgalloc_normal
41153300 ± 7% -46.5% 22026100 ± 6% proc-vmstat.pgdeactivate
85685675 ± 6% -43.4% 48485032 ± 5% proc-vmstat.pgfree
6509 ± 28% -73.7% 1714 ± 3% proc-vmstat.pgmajfault
2976 +2.7e+06% 80419921 ± 6% proc-vmstat.pgpgin
41153300 ± 7% -46.5% 22026099 ± 6% proc-vmstat.pgrefill
477249 ± 6% +72.3% 822527 ± 11% proc-vmstat.pgrotated
43943485 ± 5% -52.5% 20878181 ± 5% proc-vmstat.pgscan_direct
84942346 ± 6% -41.7% 49530834 ± 5% proc-vmstat.pgscan_kswapd
8102736 ± 7% -85.2% 1196177 ± 5% proc-vmstat.pgsteal_direct
47558626 ± 6% -39.2% 28921712 ± 6% proc-vmstat.pgsteal_kswapd
14421 -13.1% 12538 ± 3% proc-vmstat.slabs_scanned
2020457 ± 15% -66.2% 682455 ± 8% proc-vmstat.workingset_activate
17434531 ± 8% -44.0% 9754788 ± 5% proc-vmstat.workingset_refault
19.94 ± 25% -17.5 2.42 ±110% perf-profile.calltrace.cycles-pp.do_access
12.24 ± 26% -10.7 1.50 ±109% perf-profile.calltrace.cycles-pp.page_fault.do_access
12.22 ± 26% -10.7 1.50 ±109% perf-profile.calltrace.cycles-pp.do_page_fault.page_fault.do_access
12.21 ± 26% -10.7 1.50 ±109% perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault.do_access
11.34 ± 26% -10.0 1.37 ±109% perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault.do_access
16.17 ± 23% -8.6 7.58 ± 25% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
9.39 ± 27% -8.1 1.29 ±110% perf-profile.calltrace.cycles-pp.do_rw_once
7.59 ± 23% -4.5 3.05 ± 30% perf-profile.calltrace.cycles-pp.do_page_mkwrite.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
7.53 ± 23% -4.5 3.00 ± 30% perf-profile.calltrace.cycles-pp.__xfs_filemap_fault.do_page_mkwrite.__handle_mm_fault.handle_mm_fault.__do_page_fault
6.69 ± 24% -3.3 3.44 ± 23% perf-profile.calltrace.cycles-pp.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
6.55 ± 24% -3.2 3.38 ± 23% perf-profile.calltrace.cycles-pp.__xfs_filemap_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault
14.30 ± 5% -3.0 11.27 ± 7% perf-profile.calltrace.cycles-pp.__do_page_cache_readahead.ondemand_readahead.filemap_fault.__xfs_filemap_fault.__do_fault
14.31 ± 5% -3.0 11.29 ± 8% perf-profile.calltrace.cycles-pp.ondemand_readahead.filemap_fault.__xfs_filemap_fault.__do_fault.__handle_mm_fault
15.40 ± 3% -2.7 12.73 ± 6% perf-profile.calltrace.cycles-pp.filemap_fault.__xfs_filemap_fault.__do_fault.__handle_mm_fault.handle_mm_fault
2.67 ± 41% -2.1 0.56 ±116% perf-profile.calltrace.cycles-pp.__alloc_pages_slowpath.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead.filemap_fault
4.15 ± 27% -2.1 2.10 ± 12% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.__do_page_cache_readahead.ondemand_readahead.filemap_fault.__xfs_filemap_fault
2.46 ± 19% -1.6 0.89 ± 68% perf-profile.calltrace.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages
3.60 ± 24% -1.5 2.06 ± 40% perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages
9.38 ± 3% -1.4 7.95 ± 14% perf-profile.calltrace.cycles-pp.mpage_readpages.__do_page_cache_readahead.ondemand_readahead.filemap_fault.__xfs_filemap_fault
3.54 ± 3% -0.6 2.95 ± 15% perf-profile.calltrace.cycles-pp.add_to_page_cache_lru.mpage_readpages.__do_page_cache_readahead.ondemand_readahead.filemap_fault
0.31 ±100% +0.5 0.81 ± 17% perf-profile.calltrace.cycles-pp.xfs_bmap_add_extent_hole_delay.xfs_bmapi_reserve_delalloc.xfs_file_iomap_begin.iomap_apply.iomap_page_mkwrite
0.00 +0.6 0.63 ± 22% perf-profile.calltrace.cycles-pp.follow_page_pte.__get_user_pages.populate_vma_page_range.__mm_populate.vm_mmap_pgoff
0.34 ±102% +0.8 1.10 ± 40% perf-profile.calltrace.cycles-pp.__set_page_dirty.mark_buffer_dirty.__block_commit_write.block_commit_write.iomap_page_mkwrite_actor
0.59 ± 66% +1.0 1.59 ± 26% perf-profile.calltrace.cycles-pp.mark_buffer_dirty.__block_commit_write.block_commit_write.iomap_page_mkwrite_actor.iomap_apply
0.00 +1.1 1.06 ± 36% perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault.__get_user_pages
0.00 +1.1 1.11 ± 34% perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__get_user_pages.populate_vma_page_range
2.14 ± 34% +1.2 3.35 ± 19% perf-profile.calltrace.cycles-pp.xfs_file_iomap_begin.iomap_apply.iomap_page_mkwrite.__xfs_filemap_fault.do_page_mkwrite
1.14 ± 34% +1.2 2.35 ± 47% perf-profile.calltrace.cycles-pp.kmem_cache_alloc.alloc_buffer_head.alloc_page_buffers.create_empty_buffers.create_page_buffers
1.24 ± 34% +1.2 2.46 ± 45% perf-profile.calltrace.cycles-pp.alloc_page_buffers.create_empty_buffers.create_page_buffers.__block_write_begin_int.iomap_page_mkwrite_actor
1.19 ± 33% +1.2 2.43 ± 46% perf-profile.calltrace.cycles-pp.alloc_buffer_head.alloc_page_buffers.create_empty_buffers.create_page_buffers.__block_write_begin_int
1.57 ± 33% +1.4 2.98 ± 35% perf-profile.calltrace.cycles-pp.create_empty_buffers.create_page_buffers.__block_write_begin_int.iomap_page_mkwrite_actor.iomap_apply
1.60 ± 34% +1.4 3.04 ± 33% perf-profile.calltrace.cycles-pp.create_page_buffers.__block_write_begin_int.iomap_page_mkwrite_actor.iomap_apply.iomap_page_mkwrite
1.90 ± 32% +2.0 3.86 ± 38% perf-profile.calltrace.cycles-pp.__block_write_begin_int.iomap_page_mkwrite_actor.iomap_apply.iomap_page_mkwrite.__xfs_filemap_fault
0.00 +2.1 2.09 ± 50% perf-profile.calltrace.cycles-pp.memcpy_erms.memcpy_to_page._copy_to_iter.copy_page_to_iter.shmem_file_read_iter
0.00 +2.1 2.11 ± 50% perf-profile.calltrace.cycles-pp.memcpy_to_page._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.do_iter_readv_writev
0.00 +2.2 2.19 ± 50% perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.shmem_file_read_iter.do_iter_readv_writev.do_iter_read
0.00 +2.2 2.22 ± 50% perf-profile.calltrace.cycles-pp.copy_page_to_iter.shmem_file_read_iter.do_iter_readv_writev.do_iter_read.loop_queue_work
3.25 ± 30% +2.9 6.13 ± 26% perf-profile.calltrace.cycles-pp.iomap_page_mkwrite_actor.iomap_apply.iomap_page_mkwrite.__xfs_filemap_fault.do_page_mkwrite
0.00 +3.0 3.00 ± 51% perf-profile.calltrace.cycles-pp.shmem_file_read_iter.do_iter_readv_writev.do_iter_read.loop_queue_work.kthread_worker_fn
0.00 +3.1 3.06 ± 51% perf-profile.calltrace.cycles-pp.do_iter_readv_writev.do_iter_read.loop_queue_work.kthread_worker_fn.kthread
0.00 +3.3 3.26 ± 52% perf-profile.calltrace.cycles-pp.do_iter_read.loop_queue_work.kthread_worker_fn.kthread.ret_from_fork
6.10 ± 23% +4.1 10.16 ± 21% perf-profile.calltrace.cycles-pp.iomap_apply.iomap_page_mkwrite.__xfs_filemap_fault.do_page_mkwrite.__handle_mm_fault
6.60 ± 23% +4.4 11.01 ± 19% perf-profile.calltrace.cycles-pp.iomap_page_mkwrite.__xfs_filemap_fault.do_page_mkwrite.__handle_mm_fault.handle_mm_fault
3.01 ± 25% +5.9 8.94 ± 38% perf-profile.calltrace.cycles-pp.secondary_startup_64
0.00 +9.5 9.53 ± 27% perf-profile.calltrace.cycles-pp.__xfs_filemap_fault.do_page_mkwrite.__handle_mm_fault.handle_mm_fault.__get_user_pages
0.00 +9.6 9.59 ± 27% perf-profile.calltrace.cycles-pp.do_page_mkwrite.__handle_mm_fault.handle_mm_fault.__get_user_pages.populate_vma_page_range
11.83 ± 21% +10.1 21.96 ± 18% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__get_user_pages.populate_vma_page_range.__mm_populate
11.85 ± 21% +10.4 22.23 ± 18% perf-profile.calltrace.cycles-pp.handle_mm_fault.__get_user_pages.populate_vma_page_range.__mm_populate.vm_mmap_pgoff
12.52 ± 21% +11.2 23.77 ± 17% perf-profile.calltrace.cycles-pp.__get_user_pages.populate_vma_page_range.__mm_populate.vm_mmap_pgoff.ksys_mmap_pgoff
12.52 ± 21% +11.2 23.77 ± 17% perf-profile.calltrace.cycles-pp.__mm_populate.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe
12.52 ± 21% +11.2 23.77 ± 17% perf-profile.calltrace.cycles-pp.populate_vma_page_range.__mm_populate.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64
19.94 ± 25% -17.5 2.42 ±110% perf-profile.children.cycles-pp.do_access
17.94 ± 23% -9.2 8.71 ± 25% perf-profile.children.cycles-pp.__do_page_fault
17.89 ± 23% -9.2 8.69 ± 25% perf-profile.children.cycles-pp.do_page_fault
17.91 ± 23% -9.2 8.73 ± 25% perf-profile.children.cycles-pp.page_fault
9.39 ± 27% -8.1 1.29 ±110% perf-profile.children.cycles-pp.do_rw_once
14.54 ± 6% -3.3 11.29 ± 8% perf-profile.children.cycles-pp.__do_page_cache_readahead
14.52 ± 5% -3.2 11.29 ± 7% perf-profile.children.cycles-pp.ondemand_readahead
15.55 ± 4% -2.8 12.78 ± 7% perf-profile.children.cycles-pp.filemap_fault
16.04 ± 4% -2.6 13.46 ± 6% perf-profile.children.cycles-pp.__do_fault
7.32 ± 17% -1.6 5.70 ± 14% perf-profile.children.cycles-pp.__alloc_pages_nodemask
5.75 ± 23% -1.6 4.13 ± 17% perf-profile.children.cycles-pp.__alloc_pages_slowpath
5.38 ± 22% -1.5 3.85 ± 16% perf-profile.children.cycles-pp.try_to_free_pages
5.38 ± 22% -1.5 3.85 ± 16% perf-profile.children.cycles-pp.do_try_to_free_pages
4.18 ± 7% -1.4 2.76 ± 29% perf-profile.children.cycles-pp.page_vma_mapped_walk
9.57 -1.4 8.19 ± 12% perf-profile.children.cycles-pp.mpage_readpages
1.76 ± 30% -1.3 0.48 ± 69% perf-profile.children.cycles-pp.pte_alloc_one
2.91 ± 17% -1.1 1.77 ± 35% perf-profile.children.cycles-pp.page_referenced_one
0.79 ± 22% -0.7 0.07 ±113% perf-profile.children.cycles-pp.filemap_map_pages
0.65 ± 39% -0.6 0.08 ± 70% perf-profile.children.cycles-pp.shrink_slab
1.12 ± 32% -0.5 0.66 ± 29% perf-profile.children.cycles-pp.free_pcppages_bulk
1.27 ± 36% -0.5 0.82 ± 38% perf-profile.children.cycles-pp.free_unref_page_list
3.68 ± 2% -0.4 3.24 ± 9% perf-profile.children.cycles-pp.add_to_page_cache_lru
0.75 ± 22% -0.3 0.46 ± 16% perf-profile.children.cycles-pp.__radix_tree_replace
0.30 ± 24% -0.1 0.15 ± 53% perf-profile.children.cycles-pp.find_vma
1.26 ± 3% -0.1 1.12 ± 6% perf-profile.children.cycles-pp.pagevec_lru_move_fn
0.29 ± 9% -0.1 0.15 ± 23% perf-profile.children.cycles-pp.replace_slot
1.31 ± 3% -0.1 1.17 ± 8% perf-profile.children.cycles-pp.__lru_cache_add
0.20 ± 27% -0.1 0.06 ± 65% perf-profile.children.cycles-pp.drain_local_pages_wq
0.20 ± 27% -0.1 0.06 ± 65% perf-profile.children.cycles-pp.drain_pages
0.20 ± 27% -0.1 0.06 ± 65% perf-profile.children.cycles-pp.drain_pages_zone
0.83 ± 11% -0.1 0.69 ± 16% perf-profile.children.cycles-pp.page_cache_tree_insert
0.59 ± 7% -0.1 0.46 ± 20% perf-profile.children.cycles-pp.unmap_page_range
0.58 ± 5% -0.1 0.46 ± 19% perf-profile.children.cycles-pp.unmap_vmas
0.21 ± 18% -0.1 0.10 ± 50% perf-profile.children.cycles-pp.page_mapped
0.13 ± 38% -0.1 0.04 ± 58% perf-profile.children.cycles-pp.super_cache_count
0.16 ± 19% -0.1 0.09 ± 27% perf-profile.children.cycles-pp.ptep_clear_flush_young
0.09 ± 27% -0.1 0.03 ±100% perf-profile.children.cycles-pp.down_read_trylock
0.00 +0.1 0.06 ± 11% perf-profile.children.cycles-pp.find_next_and_bit
0.08 ± 32% +0.1 0.14 ± 34% perf-profile.children.cycles-pp.task_tick_fair
0.01 ±173% +0.1 0.08 ± 24% perf-profile.children.cycles-pp.cpumask_next_and
0.07 ± 59% +0.1 0.13 ± 30% perf-profile.children.cycles-pp.pmd_devmap_trans_unstable
0.02 ±173% +0.1 0.09 ± 21% perf-profile.children.cycles-pp.xfs_fsb_to_db
0.04 ±101% +0.1 0.14 ± 18% perf-profile.children.cycles-pp.kthread_queue_work
0.06 ±101% +0.1 0.15 ± 11% perf-profile.children.cycles-pp.loop_queue_rq
0.09 ± 28% +0.1 0.19 ± 27% perf-profile.children.cycles-pp.memset_erms
0.11 ± 38% +0.1 0.22 ± 15% perf-profile.children.cycles-pp.current_kernel_time64
0.07 ± 70% +0.1 0.19 ± 7% perf-profile.children.cycles-pp.blk_mq_dispatch_rq_list
0.00 +0.1 0.12 ± 65% perf-profile.children.cycles-pp.touch_atime
0.14 ± 18% +0.1 0.27 ± 31% perf-profile.children.cycles-pp.xfs_file_iomap_end
0.11 ± 70% +0.1 0.23 ± 8% perf-profile.children.cycles-pp.blk_mq_run_hw_queue
0.10 ± 69% +0.1 0.23 ± 6% perf-profile.children.cycles-pp.__blk_mq_run_hw_queue
0.10 ± 69% +0.1 0.23 ± 6% perf-profile.children.cycles-pp.blk_mq_sched_dispatch_requests
0.10 ± 65% +0.1 0.23 ± 7% perf-profile.children.cycles-pp.blk_mq_do_dispatch_sched
0.19 ± 15% +0.1 0.34 ± 18% perf-profile.children.cycles-pp.current_time
0.35 ± 25% +0.2 0.51 ± 14% perf-profile.children.cycles-pp.try_to_wake_up
0.48 ± 19% +0.2 0.63 ± 5% perf-profile.children.cycles-pp.set_page_dirty
0.28 ± 19% +0.2 0.44 ± 24% perf-profile.children.cycles-pp.xfs_vn_update_time
0.07 ±101% +0.2 0.23 ± 7% perf-profile.children.cycles-pp.__blk_mq_delay_run_hw_queue
0.09 ± 43% +0.2 0.25 ± 46% perf-profile.children.cycles-pp.update_load_avg
0.80 ± 9% +0.2 0.98 ± 6% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.55 ± 17% +0.2 0.73 ± 3% perf-profile.children.cycles-pp.radix_tree_tag_clear
0.46 ± 20% +0.2 0.65 ± 5% perf-profile.children.cycles-pp.xfs_iunlock
0.09 ±102% +0.2 0.28 ± 15% perf-profile.children.cycles-pp.blk_mq_flush_plug_list
0.10 ± 79% +0.2 0.30 ± 11% perf-profile.children.cycles-pp.blk_flush_plug_list
0.42 ± 20% +0.2 0.61 ± 19% perf-profile.children.cycles-pp.tick_sched_timer
0.49 ± 13% +0.2 0.69 ± 16% perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
0.00 +0.2 0.20 ± 19% perf-profile.children.cycles-pp.blk_finish_plug
0.00 +0.2 0.23 ± 61% perf-profile.children.cycles-pp.mpage_end_io
1.46 ± 13% +0.2 1.68 ± 4% perf-profile.children.cycles-pp.__radix_tree_lookup
0.43 ± 14% +0.3 0.70 ± 18% perf-profile.children.cycles-pp.follow_page_pte
0.37 ± 28% +0.3 0.64 ± 18% perf-profile.children.cycles-pp.blk_mq_make_request
0.15 ± 39% +0.3 0.42 ± 83% perf-profile.children.cycles-pp.menu_select
0.65 ± 8% +0.3 0.94 ± 20% perf-profile.children.cycles-pp.file_update_time
0.45 ± 12% +0.3 0.76 ± 28% perf-profile.children.cycles-pp.radix_tree_tag_set
0.39 ± 28% +0.3 0.70 ± 17% perf-profile.children.cycles-pp.submit_bio
0.39 ± 28% +0.3 0.70 ± 17% perf-profile.children.cycles-pp.generic_make_request
0.73 ± 10% +0.4 1.11 ± 8% perf-profile.children.cycles-pp.radix_tree_lookup_slot
0.07 ±119% +0.4 0.46 ± 67% perf-profile.children.cycles-pp.__pte_alloc
0.65 ± 14% +0.4 1.06 ± 36% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.22 ± 37% +0.4 0.64 ± 12% perf-profile.children.cycles-pp.follow_page_mask
0.92 ± 6% +0.5 1.46 ± 21% perf-profile.children.cycles-pp.alloc_set_pte
1.04 ± 8% +0.5 1.59 ± 19% perf-profile.children.cycles-pp.xfs_vm_set_page_dirty
1.18 ± 8% +0.7 1.83 ± 10% perf-profile.children.cycles-pp.find_get_entry
0.69 ± 11% +0.7 1.42 ± 19% perf-profile.children.cycles-pp.finish_fault
0.00 +0.8 0.77 ± 70% perf-profile.children.cycles-pp.mempool_alloc
0.00 +0.8 0.77 ± 71% perf-profile.children.cycles-pp.bio_alloc_bioset
1.22 ± 16% +0.8 2.06 ± 50% perf-profile.children.cycles-pp.smp_apic_timer_interrupt
1.00 ± 28% +0.9 1.86 ± 20% perf-profile.children.cycles-pp.mark_buffer_dirty
1.22 ± 15% +0.9 2.09 ± 50% perf-profile.children.cycles-pp.apic_timer_interrupt
1.22 ± 25% +1.0 2.23 ± 16% perf-profile.children.cycles-pp.__block_commit_write
1.18 ± 26% +1.0 2.22 ± 16% perf-profile.children.cycles-pp.block_commit_write
2.56 ± 24% +1.1 3.69 ± 16% perf-profile.children.cycles-pp.xfs_file_iomap_begin
1.32 ± 27% +1.4 2.77 ± 41% perf-profile.children.cycles-pp.alloc_buffer_head
1.41 ± 26% +1.5 2.87 ± 41% perf-profile.children.cycles-pp.alloc_page_buffers
1.75 ± 26% +1.5 3.29 ± 36% perf-profile.children.cycles-pp.create_empty_buffers
1.78 ± 27% +1.6 3.34 ± 34% perf-profile.children.cycles-pp.create_page_buffers
2.17 ± 24% +1.8 4.00 ± 35% perf-profile.children.cycles-pp.__block_write_begin_int
0.84 ± 36% +1.9 2.71 ± 31% perf-profile.children.cycles-pp.new_slab
0.88 ± 35% +1.9 2.76 ± 30% perf-profile.children.cycles-pp.___slab_alloc
0.88 ± 35% +1.9 2.76 ± 30% perf-profile.children.cycles-pp.__slab_alloc
0.00 +2.1 2.11 ± 50% perf-profile.children.cycles-pp.memcpy_to_page
0.00 +2.2 2.20 ± 50% perf-profile.children.cycles-pp._copy_to_iter
1.41 ± 25% +2.2 3.61 ± 26% perf-profile.children.cycles-pp.kmem_cache_alloc
0.00 +2.3 2.25 ± 50% perf-profile.children.cycles-pp.copy_page_to_iter
3.42 ± 23% +3.0 6.38 ± 26% perf-profile.children.cycles-pp.iomap_page_mkwrite_actor
0.00 +3.0 3.01 ± 51% perf-profile.children.cycles-pp.shmem_file_read_iter
0.00 +3.3 3.30 ± 53% perf-profile.children.cycles-pp.do_iter_read
6.12 ± 23% +4.2 10.27 ± 21% perf-profile.children.cycles-pp.iomap_apply
2.14 ± 22% +4.3 6.46 ± 22% perf-profile.children.cycles-pp.intel_idle
6.65 ± 24% +4.4 11.08 ± 19% perf-profile.children.cycles-pp.iomap_page_mkwrite
7.61 ± 24% +5.0 12.65 ± 17% perf-profile.children.cycles-pp.do_page_mkwrite
2.75 ± 26% +5.5 8.29 ± 35% perf-profile.children.cycles-pp.cpuidle_enter_state
3.01 ± 25% +5.9 8.94 ± 38% perf-profile.children.cycles-pp.secondary_startup_64
3.01 ± 25% +5.9 8.94 ± 38% perf-profile.children.cycles-pp.cpu_startup_entry
3.01 ± 25% +6.0 8.97 ± 38% perf-profile.children.cycles-pp.do_idle
16.28 ± 22% +10.9 27.21 ± 15% perf-profile.children.cycles-pp.do_syscall_64
16.28 ± 22% +11.0 27.24 ± 15% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
12.56 ± 21% +11.2 23.80 ± 17% perf-profile.children.cycles-pp.vm_mmap_pgoff
12.52 ± 21% +11.2 23.77 ± 17% perf-profile.children.cycles-pp.__mm_populate
12.52 ± 21% +11.2 23.77 ± 17% perf-profile.children.cycles-pp.populate_vma_page_range
12.55 ± 21% +11.3 23.80 ± 17% perf-profile.children.cycles-pp.ksys_mmap_pgoff
12.54 ± 21% +11.3 23.81 ± 17% perf-profile.children.cycles-pp.__get_user_pages
9.01 ± 26% -7.8 1.25 ±110% perf-profile.self.cycles-pp.do_rw_once
3.84 ± 24% -3.4 0.42 ±108% perf-profile.self.cycles-pp.do_access
2.89 ± 9% -1.0 1.88 ± 30% perf-profile.self.cycles-pp.page_vma_mapped_walk
4.27 ± 3% -0.9 3.35 ± 18% perf-profile.self.cycles-pp.do_mpage_readpage
1.14 ± 16% -0.4 0.70 ± 41% perf-profile.self.cycles-pp.page_referenced_one
0.39 ± 11% -0.1 0.24 ± 28% perf-profile.self.cycles-pp.xfs_get_blocks
0.29 ± 9% -0.1 0.15 ± 23% perf-profile.self.cycles-pp.replace_slot
0.40 ± 20% -0.1 0.26 ± 33% perf-profile.self.cycles-pp.page_referenced
0.21 ± 20% -0.1 0.10 ± 50% perf-profile.self.cycles-pp.page_mapped
0.16 ± 19% -0.1 0.09 ± 27% perf-profile.self.cycles-pp.ptep_clear_flush_young
0.09 ± 28% -0.1 0.03 ±100% perf-profile.self.cycles-pp.down_read_trylock
0.17 ± 3% -0.0 0.14 ± 12% perf-profile.self.cycles-pp.mem_cgroup_commit_charge
0.07 ± 14% +0.0 0.12 ± 21% perf-profile.self.cycles-pp.kmem_cache_free
0.00 +0.1 0.06 ± 11% perf-profile.self.cycles-pp.find_next_and_bit
0.09 ± 27% +0.1 0.15 ± 19% perf-profile.self.cycles-pp.iomap_page_mkwrite_actor
0.01 ±173% +0.1 0.08 ± 23% perf-profile.self.cycles-pp.radix_tree_lookup_slot
0.02 ±173% +0.1 0.09 ± 21% perf-profile.self.cycles-pp.xfs_fsb_to_db
0.09 ± 36% +0.1 0.16 ± 14% perf-profile.self.cycles-pp.__mark_inode_dirty
0.09 ± 28% +0.1 0.16 ± 31% perf-profile.self.cycles-pp.memset_erms
0.01 ±173% +0.1 0.08 ± 48% perf-profile.self.cycles-pp.memcpy_from_page
0.21 ± 23% +0.1 0.28 ± 12% perf-profile.self.cycles-pp.lock_page_memcg
0.13 ± 32% +0.1 0.22 ± 30% perf-profile.self.cycles-pp.mark_buffer_dirty
0.16 ± 5% +0.1 0.25 ± 12% perf-profile.self.cycles-pp.__block_commit_write
0.15 ± 33% +0.1 0.25 ± 32% perf-profile.self.cycles-pp.xfs_add_to_ioend
0.18 ± 23% +0.1 0.28 ± 15% perf-profile.self.cycles-pp.set_page_dirty
0.10 ± 36% +0.1 0.22 ± 15% perf-profile.self.cycles-pp.current_kernel_time64
0.07 ± 69% +0.1 0.19 ± 47% perf-profile.self.cycles-pp.security_file_permission
0.05 ± 63% +0.1 0.18 ± 71% perf-profile.self.cycles-pp.update_load_avg
0.13 ± 27% +0.1 0.27 ± 24% perf-profile.self.cycles-pp.__xfs_filemap_fault
0.23 ± 27% +0.1 0.38 ± 26% perf-profile.self.cycles-pp.xfs_ilock
0.44 ± 20% +0.1 0.59 ± 5% perf-profile.self.cycles-pp.radix_tree_tag_clear
0.17 ± 29% +0.2 0.34 ± 15% perf-profile.self.cycles-pp.follow_page_pte
0.00 +0.2 0.18 ± 43% perf-profile.self.cycles-pp.shmem_file_read_iter
1.43 ± 12% +0.2 1.62 ± 3% perf-profile.self.cycles-pp.__radix_tree_lookup
0.16 ± 42% +0.2 0.36 ± 33% perf-profile.self.cycles-pp.xfs_bmapi_reserve_delalloc
0.33 ± 19% +0.2 0.53 ± 29% perf-profile.self.cycles-pp.kmem_cache_alloc
0.53 ± 9% +0.2 0.76 ± 10% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.37 ± 21% +0.2 0.61 ± 23% perf-profile.self.cycles-pp.__block_write_begin_int
0.20 ± 38% +0.3 0.48 ± 13% perf-profile.self.cycles-pp.follow_page_mask
0.05 ±112% +0.3 0.34 ± 51% perf-profile.self.cycles-pp.__get_user_pages
0.45 ± 12% +0.3 0.76 ± 28% perf-profile.self.cycles-pp.radix_tree_tag_set
0.46 ± 15% +0.3 0.79 ± 11% perf-profile.self.cycles-pp.find_get_entry
0.46 ± 33% +0.4 0.82 ± 18% perf-profile.self.cycles-pp.xfs_file_iomap_begin
0.61 ± 14% +0.4 0.98 ± 16% perf-profile.self.cycles-pp.__handle_mm_fault
2.14 ± 21% +4.3 6.46 ± 22% perf-profile.self.cycles-pp.intel_idle



vm-scalability.time.user_time

280 +-+-------------------------------------------------------------------+
| + + .+..+.+.. .+.. .+.. + +.. |
260 +-+ + + .+. +. +..+.+. +.. + +.. .+ |
240 +-+ + +. + +. |
| + |
220 +-+ |
| |
200 +-+ |
| |
180 +-+ |
160 +-+ |
| O O O O O O O |
140 O-+ O O O O O O O O O O O O O O O
| O O O |
120 +-+-------------------------------------------------------------------+


vm-scalability.time.system_time

2600 +-+------------------------------------------------------------------+
| +.. .+. .+ .+.. |
2400 +-+ .. +. +..+..+. + .+. +.+..+..+ |
2200 +-+ +.+.. .+.+ +. |
| .. +..+. |
2000 +-++ |
| |
1800 +-+ |
| |
1600 +-+ |
1400 +-+ |
| O O |
1200 O-+O O O O O O O |
| O O O O O O O O O O O O O O O O
1000 +-+------------------------------------------------------------------+


vm-scalability.time.percent_of_cpu_this_job_got

900 +-+-------------------------------------------------------------------+
850 +-+ .+.. .+.. .+. .+.. .+.+.. |
|.. .+.. .+ +. +. +. +..+. +..+..+ |
800 +-+ .+ +..+..+. |
750 +-++. |
700 +-+ |
650 +-+ |
| |
600 +-+ |
550 +-+ |
500 +-+ |
450 +-+ O |
O O O O O O O O O O O O O |
400 +-+ O O O O O O O O O O O O
350 +-+-------------------------------------------------------------------+


vm-scalability.time.major_page_faults

8000 +-+------------------------------------------------------------------+
| : |
7000 +-+ : |
| : : |
|: : |
6000 +-+ : +.. .+. .+..+. + +.. |
| :.+.. + +. +..+. +..+.. .. + .. |
5000 +-+ + +..+.. + + + + |
| +.+ |
4000 +-+ |
| |
| O O |
3000 +-+ O O O O O O O O O
O O O O O O O O O O O O O O |
2000 +-+--------------------------------------O---------------------------+


vm-scalability.time.file_system_inputs

3e+08 +-+---------------------------------------------------------------+
| |
2.5e+08 +-+ O O |
| |
| O O O O
2e+08 +-+ O O |
| |
1.5e+08 +-+ |
| |
1e+08 +-+ |
O O O O O O O O O O O O O O |
| O O O O |
5e+07 +-+ |
| |
0 +-+---------------------------------------------------------------+


vm-scalability.time.file_system_outputs

1.6e+09 +-+O-O--O--O------O-----O-O-----O----O-------O--------------------+
| |
1.5e+09 +-+ |
O O O O O O O O |
1.4e+09 +-+ |
| |
1.3e+09 +-+ O O O O O O O O
| |
1.2e+09 +-+ |
| |
1.1e+09 +-+ |
|.. .+..+.. .+.+..+.. .+.. .+.. .+.. |
1e+09 +-++ +.+. + +..+.+. +.+. +.+..+.+ |
| |
9e+08 +-+---------------------------------------------------------------+


vm-scalability.workload

6e+08 +-+---------------------------------------------------------------+
|.. +..+.. +.+..+.. +.. +.. +.. |
| + .. + .. .. |
5.5e+08 +-++ +.+ + +..+.+ +.+ +.+..+.+ |
| |
| |
5e+08 +-+ |
| |
4.5e+08 +-+ |
| O O O O O O O O O O |
| |
4e+08 O-+ O O O O O O O |
| |
| O O O O O O O O
3.5e+08 +-+---------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


Attachments:
config-4.17.0-rc7-00044-gdd12385 (167.16 kB)
job-script (7.31 kB)
job.yaml (4.95 kB)
reproduce (157.71 kB)