2019-02-04 05:25:20

by John Hubbard

Subject: [PATCH 0/6] RFC v2: mm: gup/dma tracking

From: John Hubbard <[email protected]>

Hi,

I'm calling this RFC v2, even though with all the discussion it actually
feels like about v7 or so. But now that the dust has settled, it's time to
show a surprisingly small, cleaner approach. Jan and Jerome came up with a
scheme (discussed in more detail in the "track gup-pinned pages" commit
description) that does not require any additional struct page fields. This
approach has the additional advantage of being very lightweight and
therefore fast, because:

a) it mostly just does atomics,

b) unlike previous approaches, there is no need to remove pages from, and
re-add them to, the LRUs, and

c) it uses the same lock-free algorithms that get_user_pages() already
relies upon.

This RFC shows the following:

1) A patch to get the call site conversion started:

mm: introduce put_user_page*(), placeholder versions

2) A sample call site conversion:

infiniband/mm: convert put_page() to put_user_page*()

...NOT shown: all of the other 100+ gup call site conversions. Again,
those are in various states of progress and disrepair, at [1].

3) Tracking, instrumentation, and documentation patches, once all the call
sites have been converted.

4) A small refactoring patch that I'm also going to submit separately, for
the page_cache_add_speculative() routine.

This seems to be working pretty well here. I've converted enough call sites
(there is a git repo [1] with those conversions; it gets rebased madly, but
it's there if you really want to try some early testing) to run things such
as fio.

Performance: here is an fio run on an NVMe drive, using this for the fio
configuration file:

[reader]
direct=1
ioengine=libaio
blocksize=4096
size=1g
numjobs=1
rw=read
iodepth=64

reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.3
Starting 1 process
Jobs: 1 (f=1)
reader: (groupid=0, jobs=1): err= 0: pid=7011: Sun Feb 3 20:36:51 2019
read: IOPS=190k, BW=741MiB/s (778MB/s)(1024MiB/1381msec)
slat (nsec): min=2716, max=57255, avg=4048.14, stdev=1084.10
clat (usec): min=20, max=12485, avg=332.63, stdev=191.77
lat (usec): min=22, max=12498, avg=336.72, stdev=192.07
clat percentiles (usec):
| 1.00th=[ 322], 5.00th=[ 322], 10.00th=[ 322], 20.00th=[ 326],
| 30.00th=[ 326], 40.00th=[ 326], 50.00th=[ 326], 60.00th=[ 326],
| 70.00th=[ 326], 80.00th=[ 330], 90.00th=[ 330], 95.00th=[ 330],
| 99.00th=[ 478], 99.50th=[ 717], 99.90th=[ 1074], 99.95th=[ 1090],
| 99.99th=[12256]
bw ( KiB/s): min=730152, max=776512, per=99.22%, avg=753332.00, stdev=32781.47, samples=2
iops : min=182538, max=194128, avg=188333.00, stdev=8195.37, samples=2
lat (usec) : 50=0.01%, 100=0.01%, 250=0.07%, 500=99.26%, 750=0.38%
lat (usec) : 1000=0.02%
lat (msec) : 2=0.24%, 20=0.02%
cpu : usr=15.07%, sys=84.13%, ctx=10, majf=0, minf=74
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
READ: bw=741MiB/s (778MB/s), 741MiB/s-741MiB/s (778MB/s-778MB/s), io=1024MiB (1074MB), run=1381-1381msec

Disk stats (read/write):
nvme0n1: ios=216966/0, merge=0/0, ticks=6112/0, in_queue=704, util=91.34%

A write-up of the larger problem follows (co-written with Jérôme Glisse):

Overview
========

Some kernel components (file systems, device drivers) need to access
memory that is specified via process virtual address. For a long time, the
API to achieve that was get_user_pages ("GUP") and its variations. However,
GUP has critical limitations that have been overlooked; in particular, GUP
does not interact correctly with filesystems in all situations. That means
that file-backed memory + GUP is a recipe for potential problems, some of
which have already occurred in the field.

GUP was first introduced for Direct IO (O_DIRECT), allowing filesystem code
to get the struct page behind a virtual address and to let storage hardware
perform a direct copy to or from that page. This is a short-lived access
pattern, and as such, the window for a concurrent writeback of a GUP'd page
was small enough that there were not (we think) any reported problems.
Also, userspace was expected to understand and accept that Direct IO was
not synchronized with memory-mapped access to that data, nor with any
process address space changes such as munmap(), mremap(), etc.

Over the years, more GUP uses have appeared (virtualization, device
drivers, RDMA) that can keep the pages they get via GUP for a long period
of time (seconds, minutes, hours, days, ...). This long-term pinning makes
an underlying design problem more obvious.

In fact, there are a number of key problems inherent to GUP:

Interactions with file systems
==============================

File systems expect to be able to write back data, both to reclaim pages,
and for data integrity. Allowing other hardware (NICs, GPUs, etc) to gain
write access to the file memory pages means that such hardware can dirty the
pages, without the filesystem being aware. This can, in some cases
(depending on filesystem, filesystem options, block device, block device
options, and other variables), lead to data corruption, and also to kernel
bugs of the form:

kernel BUG at /build/linux-fQ94TU/linux-4.4.0/fs/ext4/inode.c:1899!
backtrace:
ext4_writepage
__writepage
write_cache_pages
ext4_writepages
do_writepages
__writeback_single_inode
writeback_sb_inodes
__writeback_inodes_wb
wb_writeback
wb_workfn
process_one_work
worker_thread
kthread
ret_from_fork

...which is due to the file system asserting that there are still buffer
heads attached:

#define page_buffers(page) \
({ \
	BUG_ON(!PagePrivate(page)); \
	((struct buffer_head *)page_private(page)); \
})

Dave Chinner's description of this is very clear:

"The fundamental issue is that ->page_mkwrite must be called on every
write access to a clean file backed page, not just the first one.
How long the GUP reference lasts is irrelevant, if the page is clean
and you need to dirty it, you must call ->page_mkwrite before it is
marked writeable and dirtied. Every. Time."

This is just one symptom of the larger design problem: filesystems do not
actually support get_user_pages() being called on their pages, and letting
hardware write directly to those pages--even though that pattern has been
going on since about 2005 or so.

Long term GUP
=============

Long term GUP is an issue when FOLL_WRITE is specified to GUP (so, a
writeable mapping is created), and the pages are file-backed. That can lead
to filesystem corruption. What happens is that when a file-backed page is
being written back, it is first mapped read-only in all of the CPU page
tables; the file system then assumes that nobody can write to the page, and
that the page content is therefore stable. Unfortunately, the GUP callers
generally do not monitor changes to the CPU page tables; they instead
assume that the following pattern is safe (it's not):

get_user_pages()

Hardware can keep a reference to those pages for a very long time,
and write to them at any time. Because "hardware" here means "devices
that are not a CPU", this activity occurs without any interaction
with the kernel's file system code.

for each page:
    set_page_dirty()
    put_page()

In fact, the GUP documentation even recommends that pattern.
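
As a concrete illustration, here is a minimal C sketch of that pattern, as
a driver might code it. The function and variable names are hypothetical;
only the get_user_pages_fast() / set_page_dirty() / put_page() sequence is
the one described above:

    static int pin_and_dma(unsigned long user_addr, int npages,
                           struct page **pages)
    {
            int i, pinned;

            /* write = 1: request a writeable mapping (FOLL_WRITE) */
            pinned = get_user_pages_fast(user_addr, npages, 1, pages);
            if (pinned <= 0)
                    return -EFAULT;

            /*
             * Device DMA writes into these pages here, possibly for a
             * long time, with no filesystem involvement. If writeback
             * runs now, the filesystem wrongly concludes that the pages
             * are clean and stable.
             */

            for (i = 0; i < pinned; i++) {
                    set_page_dirty(pages[i]); /* conflicts with writeback */
                    put_page(pages[i]);
            }
            return 0;
    }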

Anyway, the file system assumes that the page is stable (nothing is writing
to the page), and that is a problem: stable page content is necessary for
many filesystem actions during writeback, such as checksum, encryption,
many filesystem actions during writeback, such as checksum, encryption,
RAID striping, etc. Furthermore, filesystem features like COW (copy on
write) or snapshot also rely on being able to use a new page as the
memory for that range inside the file.

Corruption during write back is clearly possible here. To solve that, one
idea is to identify pages that have active GUP, so that we can use a bounce
page to write stable data to the filesystem. The filesystem would work
on the bounce page, while any active GUP users might write to the
original page. This would avoid the stable page violation problem, but note
that it is only part of the overall solution, because other problems
remain.
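
For illustration only, here is a minimal sketch of that bounce page idea.
writeback_stable_page() is a hypothetical helper, not part of this series;
page_gup_pinned() is the query added by the "track gup-pinned pages" patch:

    static struct page *writeback_stable_page(struct page *page)
    {
            struct page *bounce;

            if (!page_gup_pinned(page))
                    return page;            /* no pin: already stable */

            bounce = alloc_page(GFP_NOFS);  /* writeback context */
            if (!bounce)
                    return page;            /* fall back; race remains */

            copy_highpage(bounce, page);    /* snapshot current contents */
            return bounce;                  /* submit I/O against the copy */
    }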

Other filesystem features that need to replace the page with a new one can
be inhibited for pages that are GUP-pinned. This will, however, alter and
limit some of those filesystem features. The only fix for that would be to
require that GUP users monitor and respond to CPU page table updates.
Subsystems
such as ODP and HMM do this, for example. This aspect of the problem is
still under discussion.

Direct IO
=========

Direct IO can cause corruption, if userspace does Direct-IO that writes to
a range of virtual addresses that are mmap'd to a file. The pages written
to are file-backed pages that can be under write back, while the Direct IO
is taking place. Here, Direct IO races with write back: it calls
GUP before page_mkclean() has replaced the CPU pte with a read-only entry.
The race window is pretty small, which is probably why years have gone by
before we noticed this problem: Direct IO is generally very quick, and
tends to finish up before the filesystem gets around to doing anything with
the page contents. However, it's still a real problem. The solution is
to never let GUP return pages that are under write back, but instead,
force GUP to take a write fault on those pages. That way, GUP will
properly synchronize with the active write back. This does not change the
required GUP behavior, it just avoids that race.

What this patchset does
=======================

This patchset overloads page->_refcount, in order to track GUP-pinned
pages.

This patchset checks if the page is under write back, and if so, it backs
off and forces a page fault (via the GUP slow path). Before this patchset,
GUP might have returned the struct page because page_mkclean() had not yet
updated the CPU page table. After this patchset, GUP no longer races with
page_mkclean(), and thus any user of GUP properly synchronizes with active
write back (this is useful not only for Direct IO, but also for other
users of GUP).
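
For reference, a hedged sketch of what that back-off could look like in
the gup fast path. This is written in the style of gup_pte_range() (see the
"track gup-pinned pages" patch), but it is illustrative, not a literal hunk
from this series:

    static bool gup_fast_try_pin(struct page *head, pte_t pte, pte_t *ptep)
    {
            if (!page_cache_gup_pin_speculative(head))
                    return false;   /* page is being freed */

            /* Back off if writeback started, or the pte changed under us. */
            if (PageWriteback(head) ||
                unlikely(pte_val(pte) != pte_val(*ptep))) {
                    put_user_page(head);
                    return false;   /* caller falls back to the slow path */
            }
            return true;
    }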

This patchset does not include any of the filesystem changes needed to
fix the issues. That is left to a separate patchset that will use this
new tracking.


Changes from earlier versions
=============================

-- Fixed up kerneldoc issues in put_user_page*() functions, in response
to Mike Rapoport's review.

-- Use overloaded page->_refcount to track gup-pinned pages. This avoids the
need for an extra page flag, and also avoids the need for an extra counting
field.

[1] [email protected]:johnhubbard/linux.git (branch: gup_dma_core)
[2] https://lwn.net/Articles/753027/ "The trouble with get_user_pages()"

Suggested-by: Jan Kara <[email protected]>
Suggested-by: Jérôme Glisse <[email protected]>

Cc: Christian Benvenuti <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christopher Lameter <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Dennis Dalessandro <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Mike Marciniszyn <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Tom Talpey <[email protected]>


John Hubbard (6):
mm: introduce put_user_page*(), placeholder versions
infiniband/mm: convert put_page() to put_user_page*()
mm: page_cache_add_speculative(): refactoring
mm/gup: track gup-pinned pages
mm/gup: /proc/vmstat support for get/put user pages
mm/gup: Documentation/vm/get_user_pages.rst, MAINTAINERS

Documentation/vm/get_user_pages.rst | 197 ++++++++++++++++++++
Documentation/vm/index.rst | 1 +
MAINTAINERS | 10 +
drivers/infiniband/core/umem.c | 7 +-
drivers/infiniband/core/umem_odp.c | 2 +-
drivers/infiniband/hw/hfi1/user_pages.c | 11 +-
drivers/infiniband/hw/mthca/mthca_memfree.c | 6 +-
drivers/infiniband/hw/qib/qib_user_pages.c | 11 +-
drivers/infiniband/hw/qib/qib_user_sdma.c | 6 +-
drivers/infiniband/hw/usnic/usnic_uiom.c | 7 +-
include/linux/mm.h | 57 ++++++
include/linux/mmzone.h | 5 +
include/linux/pagemap.h | 36 ++--
mm/gup.c | 80 ++++++--
mm/swap.c | 104 +++++++++++
mm/vmstat.c | 5 +
16 files changed, 482 insertions(+), 63 deletions(-)
create mode 100644 Documentation/vm/get_user_pages.rst

--
2.20.1



2019-02-04 05:22:42

by John Hubbard

Subject: [PATCH 3/6] mm: page_cache_add_speculative(): refactoring

From: John Hubbard <[email protected]>

This combines the common elements of these routines:

page_cache_get_speculative()
page_cache_add_speculative()

This was anticipated by the original author, as shown by the comment
in commit ce0ad7f095258 ("powerpc/mm: Lockless get_user_pages_fast()
for 64-bit (v3)"):

"Same as above, but add instead of inc (could just be merged)"

An upcoming patch for get_user_pages() tracking will use these routines,
so let's remove the duplication now.

There is no intention to introduce any behavioral change, but there is a
small risk of that, due to slightly differing ways of expressing the
TINY_RCU and related configurations.
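
For reference, these are the two configuration tests being unified (copied
from the routines changed below). The small risk mentioned above comes from
the fact that they are not textually identical, even though both are meant
to select the !SMP, non-preemptible RCU case:

    /* page_cache_get_speculative() tests: */
    #ifdef CONFIG_TINY_RCU

    /* ...while page_cache_add_speculative() tests: */
    #if !defined(CONFIG_SMP) && defined(CONFIG_TREE_RCU)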

Cc: Nick Piggin <[email protected]>
Cc: Dave Kleikamp <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
include/linux/pagemap.h | 33 +++++++++++----------------------
1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index e2d7039af6a3..5c8a9b59cbdc 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -164,8 +164,10 @@ void release_pages(struct page **pages, int nr);
* will find the page or it will not. Likewise, the old find_get_page could run
* either before the insertion or afterwards, depending on timing.
*/
-static inline int page_cache_get_speculative(struct page *page)
+static inline int __page_cache_add_speculative(struct page *page, int count)
{
+ VM_BUG_ON(in_interrupt());
+
#ifdef CONFIG_TINY_RCU
# ifdef CONFIG_PREEMPT_COUNT
VM_BUG_ON(!in_atomic() && !irqs_disabled());
@@ -180,10 +182,10 @@ static inline int page_cache_get_speculative(struct page *page)
* SMP requires.
*/
VM_BUG_ON_PAGE(page_count(page) == 0, page);
- page_ref_inc(page);
+ page_ref_add(page, count);

#else
- if (unlikely(!get_page_unless_zero(page))) {
+ if (unlikely(!page_ref_add_unless(page, count, 0))) {
/*
* Either the page has been freed, or will be freed.
* In either case, retry here and the caller should
@@ -197,27 +199,14 @@ static inline int page_cache_get_speculative(struct page *page)
return 1;
}

-/*
- * Same as above, but add instead of inc (could just be merged)
- */
-static inline int page_cache_add_speculative(struct page *page, int count)
+static inline int page_cache_get_speculative(struct page *page)
{
- VM_BUG_ON(in_interrupt());
-
-#if !defined(CONFIG_SMP) && defined(CONFIG_TREE_RCU)
-# ifdef CONFIG_PREEMPT_COUNT
- VM_BUG_ON(!in_atomic() && !irqs_disabled());
-# endif
- VM_BUG_ON_PAGE(page_count(page) == 0, page);
- page_ref_add(page, count);
-
-#else
- if (unlikely(!page_ref_add_unless(page, count, 0)))
- return 0;
-#endif
- VM_BUG_ON_PAGE(PageCompound(page) && page != compound_head(page), page);
+ return __page_cache_add_speculative(page, 1);
+}

- return 1;
+static inline int page_cache_add_speculative(struct page *page, int count)
+{
+ return __page_cache_add_speculative(page, count);
}

#ifdef CONFIG_NUMA
--
2.20.1


2019-02-04 05:22:43

by John Hubbard

Subject: [PATCH 4/6] mm/gup: track gup-pinned pages

From: John Hubbard <[email protected]>

Now that all callers of get_user_pages*() have been updated to use
put_user_page(), instead of put_page(), add tracking of such
"gup-pinned" pages. The purpose of this tracking is to answer the
question "has this page been pinned by a call to get_user_pages()?"

In order to answer that, refcounting is required. get_user_pages() and all
its variants increment a reference count, and put_user_page() and its
variants decrement that reference count. If the net count is *effectively*
non-zero (see below), then the page is considered gup-pinned.

What to do in response to encountering such a page is left to later
patchsets. There is discussion about this in [1], and in an upcoming patch
that adds:

Documentation/vm/get_user_pages.rst

So, this patch simply adds tracking of such pages. In order to achieve
this without using up any more bits or fields in struct page, the
page->_refcount field is overloaded. gup pins are incremented by adding a
large chunk (1024) instead of 1. This provides a way to say, "either this
page is gup-pinned, or you have a *lot* of references on it, and thus this
is a false positive". False positives are generally OK, as long as they
are expected to be rare: taking action for a page that looks gup-pinned,
but is not, is not going to be a problem. It's false negatives (failing
to detect a gup-pinned page) that would be a problem, and those won't
happen with this approach.
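
To make that arithmetic concrete, here is a userspace toy model of the
scheme (ordinary C, not kernel code; looks_gup_pinned() mirrors the
page_gup_pinned() test added below):

    #include <stdbool.h>
    #include <stdio.h>

    #define GUP_PIN_COUNTING_BIAS (1UL << 10)  /* 1024, as in this patch */

    /* Same test as page_gup_pinned(), below. */
    static bool looks_gup_pinned(unsigned long refcount)
    {
            return refcount > GUP_PIN_COUNTING_BIAS;
    }

    int main(void)
    {
            unsigned long refcount = 1;   /* one ordinary page reference */

            refcount += 2 * GUP_PIN_COUNTING_BIAS;  /* two gup pins: 2049 */
            printf("pinned: %d\n", looks_gup_pinned(refcount));  /* 1 */

            refcount -= 2 * GUP_PIN_COUNTING_BIAS;  /* 2x put_user_page() */
            printf("pinned: %d\n", looks_gup_pinned(refcount));  /* 0 */
            return 0;
    }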

This takes advantage of two distinct, pre-existing lock-free algorithms:

a) get_user_pages() and things such as page_mkclean() both operate on
page table entries, without taking locks. This relies partly on just
letting the CPU hardware (which of course also never takes locks to
use its own page tables) take page faults if something has changed.

b) page_cache_get_speculative(), called by get_user_pages(), is a way to
avoid having pages get freed out from under get_user_pages() or other
things that want to pin pages.

As a result, this patch is not expected to change performance in any
noticeable way.

In order to test this, a lot of get_user_pages() call sites have to be
converted over to use put_user_page(), but I did that locally, and here
is an fio run on an NVMe drive, using this for the fio configuration file:

[reader]
direct=1
ioengine=libaio
blocksize=4096
size=1g
numjobs=1
rw=read
iodepth=64

reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B,
(T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.3
Starting 1 process
Jobs: 1 (f=1)
reader: (groupid=0, jobs=1): err= 0: pid=7011: Sun Feb 3 20:36:51 2019
read: IOPS=190k, BW=741MiB/s (778MB/s)(1024MiB/1381msec)
slat (nsec): min=2716, max=57255, avg=4048.14, stdev=1084.10
clat (usec): min=20, max=12485, avg=332.63, stdev=191.77
lat (usec): min=22, max=12498, avg=336.72, stdev=192.07
clat percentiles (usec):
| 1.00th=[ 322], 5.00th=[ 322], 10.00th=[ 322], 20.00th=[ 326],
| 30.00th=[ 326], 40.00th=[ 326], 50.00th=[ 326], 60.00th=[ 326],
| 70.00th=[ 326], 80.00th=[ 330], 90.00th=[ 330], 95.00th=[ 330],
| 99.00th=[ 478], 99.50th=[ 717], 99.90th=[ 1074], 99.95th=[ 1090],
| 99.99th=[12256]
bw ( KiB/s): min=730152, max=776512, per=99.22%, avg=753332.00,
stdev=32781.47, samples=2
iops : min=182538, max=194128, avg=188333.00, stdev=8195.37,
samples=2
lat (usec) : 50=0.01%, 100=0.01%, 250=0.07%, 500=99.26%, 750=0.38%
lat (usec) : 1000=0.02%
lat (msec) : 2=0.24%, 20=0.02%
cpu : usr=15.07%, sys=84.13%, ctx=10, majf=0, minf=74
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
>=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
>=64=0.0%
issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
READ: bw=741MiB/s (778MB/s), 741MiB/s-741MiB/s (778MB/s-778MB/s),
io=1024MiB (1074MB), run=1381-1381msec

Disk stats (read/write):
nvme0n1: ios=216966/0, merge=0/0, ticks=6112/0, in_queue=704, util=91.34%

[1] https://lwn.net/Articles/753027/ "The trouble with get_user_pages()"

Suggested-by: Jan Kara <[email protected]>
Suggested-by: Jérôme Glisse <[email protected]>

Cc: Christian Benvenuti <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christopher Lameter <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Dennis Dalessandro <[email protected]>
Cc: Doug Ledford <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Mike Marciniszyn <[email protected]>
Cc: Ralph Campbell <[email protected]>
Cc: Tom Talpey <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
include/linux/mm.h | 81 +++++++++++++++++++++++++++++------------
include/linux/pagemap.h | 5 +++
mm/gup.c | 60 ++++++++++++++++++++++--------
mm/swap.c | 21 +++++++++++
4 files changed, 128 insertions(+), 39 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 809b7397d41e..dcb01cf0a9de 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -965,6 +965,63 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
}
#endif /* CONFIG_DEV_PAGEMAP_OPS */

+/*
+ * GUP_PIN_COUNTING_BIAS, and the associated functions that use it, overload
+ * the page's refcount so that two separate items are tracked: the original page
+ * reference count, and also a new count of how many get_user_pages() calls were
+ * made against the page. ("gup-pinned" is another term for the latter).
+ *
+ * With this scheme, get_user_pages() becomes special: such pages are marked
+ * as distinct from normal pages. As such, the new put_user_page() call (and
+ * its variants) must be used in order to release gup-pinned pages.
+ *
+ * Choice of value:
+ *
+ * By making GUP_PIN_COUNTING_BIAS a power of two, debugging of page reference
+ * counts with respect to get_user_pages() and put_user_page() becomes simpler,
+ * due to the fact that adding an even power of two to the page refcount has
+ * the effect of using only the upper N bits, for the code that counts up using
+ * the bias value. This means that the lower bits are left for the exclusive
+ * use of the original code that increments and decrements by one (or at least,
+ * by much smaller values than the bias value).
+ *
+ * Of course, once the lower bits overflow into the upper bits (and this is
+ * OK, because subtraction recovers the original values), then visual inspection
+ * no longer suffices to directly view the separate counts. However, for normal
+ * applications that don't have huge page reference counts, this won't be an
+ * issue.
+ *
+ * This has to work on 32-bit as well as 64-bit systems. In the more constrained
+ * 32-bit systems, the 10-bit bias value leaves 22 bits for the
+ * upper bits. Therefore, only about 4M calls to get_user_pages() may occur for
+ * a page.
+ *
+ * Locking: the lockless algorithm described in page_cache_get_speculative()
+ * and page_cache_gup_pin_speculative() provides safe operation for
+ * get_user_pages and page_mkclean and other calls that race to set up page
+ * table entries.
+ */
+#define GUP_PIN_COUNTING_BIAS (1UL << 10)
+
+int get_gup_pin_page(struct page *page);
+
+void put_user_page(struct page *page);
+void put_user_pages_dirty(struct page **pages, unsigned long npages);
+void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
+void put_user_pages(struct page **pages, unsigned long npages);
+
+/**
+ * page_gup_pinned() - report if a page is gup-pinned (pinned by a call to
+ * get_user_pages).
+ * @page: pointer to page to be queried.
+ * Return: true, if it is likely that the page has been "gup-pinned";
+ * false, if the page is definitely not gup-pinned.
+ */
+static inline bool page_gup_pinned(struct page *page)
+{
+ return (page_ref_count(page)) > GUP_PIN_COUNTING_BIAS;
+}
+
static inline void get_page(struct page *page)
{
page = compound_head(page);
@@ -993,30 +1050,6 @@ static inline void put_page(struct page *page)
__put_page(page);
}

-/**
- * put_user_page() - release a gup-pinned page
- * @page: pointer to page to be released
- *
- * Pages that were pinned via get_user_pages*() must be released via
- * either put_user_page(), or one of the put_user_pages*() routines
- * below. This is so that eventually, pages that are pinned via
- * get_user_pages*() can be separately tracked and uniquely handled. In
- * particular, interactions with RDMA and filesystems need special
- * handling.
- *
- * put_user_page() and put_page() are not interchangeable, despite this early
- * implementation that makes them look the same. put_user_page() calls must
- * be perfectly matched up with get_user_page() calls.
- */
-static inline void put_user_page(struct page *page)
-{
- put_page(page);
-}
-
-void put_user_pages_dirty(struct page **pages, unsigned long npages);
-void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
-void put_user_pages(struct page **pages, unsigned long npages);
-
#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
#define SECTION_IN_PAGE_FLAGS
#endif
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 5c8a9b59cbdc..5f5b72ba595f 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -209,6 +209,11 @@ static inline int page_cache_add_speculative(struct page *page, int count)
return __page_cache_add_speculative(page, count);
}

+static inline int page_cache_gup_pin_speculative(struct page *page)
+{
+ return __page_cache_add_speculative(page, GUP_PIN_COUNTING_BIAS);
+}
+
#ifdef CONFIG_NUMA
extern struct page *__page_cache_alloc(gfp_t gfp);
#else
diff --git a/mm/gup.c b/mm/gup.c
index 05acd7e2eb22..3291da342f9c 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -25,6 +25,26 @@ struct follow_page_context {
unsigned int page_mask;
};

+/**
+ * get_gup_pin_page() - mark a page as being used by get_user_pages().
+ * @page: pointer to page to be marked
+ * Return: 0 for success, -EOVERFLOW if the page refcount would have
+ * overflowed.
+ *
+ */
+int get_gup_pin_page(struct page *page)
+{
+ page = compound_head(page);
+
+ if (page_ref_count(page) >= (UINT_MAX - GUP_PIN_COUNTING_BIAS)) {
+ WARN_ONCE(1, "get_user_pages pin count overflowed");
+ return -EOVERFLOW;
+ }
+
+ page_ref_add(page, GUP_PIN_COUNTING_BIAS);
+ return 0;
+}
+
static struct page *no_page_table(struct vm_area_struct *vma,
unsigned int flags)
{
@@ -157,8 +177,14 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
goto retry;
}

- if (flags & FOLL_GET)
- get_page(page);
+ if (flags & FOLL_GET) {
+ int ret = get_gup_pin_page(page);
+
+ if (ret) {
+ page = ERR_PTR(ret);
+ goto out;
+ }
+ }
if (flags & FOLL_TOUCH) {
if ((flags & FOLL_WRITE) &&
!pte_dirty(pte) && !PageDirty(page))
@@ -497,7 +523,10 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
if (is_device_public_page(*page))
goto unmap;
}
- get_page(*page);
+
+ ret = get_gup_pin_page(*page);
+ if (ret)
+ goto unmap;
out:
ret = 0;
unmap:
@@ -1429,11 +1458,11 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
page = pte_page(pte);
head = compound_head(page);

- if (!page_cache_get_speculative(head))
+ if (!page_cache_gup_pin_speculative(head))
goto pte_unmap;

if (unlikely(pte_val(pte) != pte_val(*ptep))) {
- put_page(head);
+ put_user_page(head);
goto pte_unmap;
}

@@ -1488,7 +1517,11 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,
}
SetPageReferenced(page);
pages[*nr] = page;
- get_page(page);
+ if (get_gup_pin_page(page)) {
+ undo_dev_pagemap(nr, nr_start, pages);
+ return 0;
+ }
+
(*nr)++;
pfn++;
} while (addr += PAGE_SIZE, addr != end);
@@ -1569,15 +1602,14 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
} while (addr += PAGE_SIZE, addr != end);

head = compound_head(pmd_page(orig));
- if (!page_cache_add_speculative(head, refs)) {
+ if (!page_cache_gup_pin_speculative(head)) {
*nr -= refs;
return 0;
}

if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) {
*nr -= refs;
- while (refs--)
- put_page(head);
+ put_user_page(head);
return 0;
}

@@ -1607,15 +1639,14 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
} while (addr += PAGE_SIZE, addr != end);

head = compound_head(pud_page(orig));
- if (!page_cache_add_speculative(head, refs)) {
+ if (!page_cache_gup_pin_speculative(head)) {
*nr -= refs;
return 0;
}

if (unlikely(pud_val(orig) != pud_val(*pudp))) {
*nr -= refs;
- while (refs--)
- put_page(head);
+ put_user_page(head);
return 0;
}

@@ -1644,15 +1675,14 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
} while (addr += PAGE_SIZE, addr != end);

head = compound_head(pgd_page(orig));
- if (!page_cache_add_speculative(head, refs)) {
+ if (!page_cache_gup_pin_speculative(head)) {
*nr -= refs;
return 0;
}

if (unlikely(pgd_val(orig) != pgd_val(*pgdp))) {
*nr -= refs;
- while (refs--)
- put_page(head);
+ put_user_page(head);
return 0;
}

diff --git a/mm/swap.c b/mm/swap.c
index 7c42ca45bb89..39b0ddd35933 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -133,6 +133,27 @@ void put_pages_list(struct list_head *pages)
}
EXPORT_SYMBOL(put_pages_list);

+/**
+ * put_user_page() - release a gup-pinned page
+ * @page: pointer to page to be released
+ *
+ * Pages that were pinned via get_user_pages*() must be released via
+ * either put_user_page(), or one of the put_user_pages*() routines
+ * below. This is so that eventually, pages that are pinned via
+ * get_user_pages*() can be separately tracked and uniquely handled. In
+ * particular, interactions with RDMA and filesystems need special
+ * handling.
+ */
+void put_user_page(struct page *page)
+{
+ page = compound_head(page);
+
+ VM_BUG_ON_PAGE(page_ref_count(page) < GUP_PIN_COUNTING_BIAS, page);
+
+ page_ref_sub(page, GUP_PIN_COUNTING_BIAS);
+}
+EXPORT_SYMBOL(put_user_page);
+
typedef int (*set_dirty_func)(struct page *page);

static void __put_user_pages_dirty(struct page **pages,
--
2.20.1


2019-02-04 05:22:43

by John Hubbard

Subject: [PATCH 1/6] mm: introduce put_user_page*(), placeholder versions

From: John Hubbard <[email protected]>

Introduces put_user_page(), which simply calls put_page().
This provides a way to update all get_user_pages*() callers,
so that they call put_user_page(), instead of put_page().

Also introduces put_user_pages(), and a few dirty/locked variations,
as a replacement for release_pages(), and also as a replacement
for open-coded loops that release multiple pages.
These may be used for subsequent performance improvements,
via batching of pages to be released.
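
As a usage sketch (hypothetical driver code, mirroring the conversions in
patch 2 of this series), a typical open-coded release loop:

    for (i = 0; i < npages; i++) {
            if (dirty)
                    set_page_dirty_lock(pages[i]);
            put_page(pages[i]);
    }

becomes simply:

    if (dirty)
            put_user_pages_dirty_lock(pages, npages);
    else
            put_user_pages(pages, npages);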

This is the first step of fixing a problem (also described in [1] and
[2]) with interactions between get_user_pages ("gup") and filesystems.

Problem description: let's start with a bug report. Below is what happens
sometimes, under memory pressure, when a driver pins some pages via gup,
and then marks those pages dirty, and releases them. Note that the gup
documentation actually recommends that pattern. The problem is that the
filesystem may do a writeback while the pages are gup-pinned, and then the
filesystem believes that the pages are clean. So, when the driver later
marks the pages as dirty, that conflicts with the filesystem's page
tracking and results in a BUG(), like this one that I experienced:

kernel BUG at /build/linux-fQ94TU/linux-4.4.0/fs/ext4/inode.c:1899!
backtrace:
ext4_writepage
__writepage
write_cache_pages
ext4_writepages
do_writepages
__writeback_single_inode
writeback_sb_inodes
__writeback_inodes_wb
wb_writeback
wb_workfn
process_one_work
worker_thread
kthread
ret_from_fork

...which is due to the file system asserting that there are still buffer
heads attached:

#define page_buffers(page) \
({ \
	BUG_ON(!PagePrivate(page)); \
	((struct buffer_head *)page_private(page)); \
})

Dave Chinner's description of this is very clear:

"The fundamental issue is that ->page_mkwrite must be called on every
write access to a clean file backed page, not just the first one.
How long the GUP reference lasts is irrelevant, if the page is clean
and you need to dirty it, you must call ->page_mkwrite before it is
marked writeable and dirtied. Every. Time."

This is just one symptom of the larger design problem: filesystems do not
actually support get_user_pages() being called on their pages, and letting
hardware write directly to those pages--even though that pattern has been
going on since about 2005 or so.

The steps to fix it are:

1) (This patch): provide put_user_page*() routines, intended to be used
for releasing pages that were pinned via get_user_pages*().

2) Convert all of the call sites for get_user_pages*(), to
invoke put_user_page*(), instead of put_page(). This involves dozens of
call sites, and will take some time.

3) After (2) is complete, use get_user_pages*() and put_user_page*() to
implement tracking of these pages. This tracking will be separate from
the existing struct page refcounting.

4) Use the tracking and identification of these pages, to implement
special handling (especially in writeback paths) when the pages are
backed by a filesystem.

[1] https://lwn.net/Articles/774411/ : "DMA and get_user_pages()"
[2] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Christopher Lameter <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Ralph Campbell <[email protected]>

Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
include/linux/mm.h | 24 ++++++++++++++
mm/swap.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 106 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 80bb6408fe73..809b7397d41e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -993,6 +993,30 @@ static inline void put_page(struct page *page)
__put_page(page);
}

+/**
+ * put_user_page() - release a gup-pinned page
+ * @page: pointer to page to be released
+ *
+ * Pages that were pinned via get_user_pages*() must be released via
+ * either put_user_page(), or one of the put_user_pages*() routines
+ * below. This is so that eventually, pages that are pinned via
+ * get_user_pages*() can be separately tracked and uniquely handled. In
+ * particular, interactions with RDMA and filesystems need special
+ * handling.
+ *
+ * put_user_page() and put_page() are not interchangeable, despite this early
+ * implementation that makes them look the same. put_user_page() calls must
+ * be perfectly matched up with get_user_page() calls.
+ */
+static inline void put_user_page(struct page *page)
+{
+ put_page(page);
+}
+
+void put_user_pages_dirty(struct page **pages, unsigned long npages);
+void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
+void put_user_pages(struct page **pages, unsigned long npages);
+
#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
#define SECTION_IN_PAGE_FLAGS
#endif
diff --git a/mm/swap.c b/mm/swap.c
index 4929bc1be60e..7c42ca45bb89 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -133,6 +133,88 @@ void put_pages_list(struct list_head *pages)
}
EXPORT_SYMBOL(put_pages_list);

+typedef int (*set_dirty_func)(struct page *page);
+
+static void __put_user_pages_dirty(struct page **pages,
+ unsigned long npages,
+ set_dirty_func sdf)
+{
+ unsigned long index;
+
+ for (index = 0; index < npages; index++) {
+ struct page *page = compound_head(pages[index]);
+
+ if (!PageDirty(page))
+ sdf(page);
+
+ put_user_page(page);
+ }
+}
+
+/**
+ * put_user_pages_dirty() - release and dirty an array of gup-pinned pages
+ * @pages: array of pages to be marked dirty and released.
+ * @npages: number of pages in the @pages array.
+ *
+ * "gup-pinned page" refers to a page that has had one of the get_user_pages()
+ * variants called on that page.
+ *
+ * For each page in the @pages array, make that page (or its head page, if a
+ * compound page) dirty, if it was previously listed as clean. Then, release
+ * the page using put_user_page().
+ *
+ * Please see the put_user_page() documentation for details.
+ *
+ * set_page_dirty(), which does not lock the page, is used here.
+ * Therefore, it is the caller's responsibility to ensure that this is
+ * safe. If not, then put_user_pages_dirty_lock() should be called instead.
+ *
+ */
+void put_user_pages_dirty(struct page **pages, unsigned long npages)
+{
+ __put_user_pages_dirty(pages, npages, set_page_dirty);
+}
+EXPORT_SYMBOL(put_user_pages_dirty);
+
+/**
+ * put_user_pages_dirty_lock() - release and dirty an array of gup-pinned pages
+ * @pages: array of pages to be marked dirty and released.
+ * @npages: number of pages in the @pages array.
+ *
+ * For each page in the @pages array, make that page (or its head page, if a
+ * compound page) dirty, if it was previously listed as clean. Then, release
+ * the page using put_user_page().
+ *
+ * Please see the put_user_page() documentation for details.
+ *
+ * This is just like put_user_pages_dirty(), except that it invokes
+ * set_page_dirty_lock(), instead of set_page_dirty().
+ *
+ */
+void put_user_pages_dirty_lock(struct page **pages, unsigned long npages)
+{
+ __put_user_pages_dirty(pages, npages, set_page_dirty_lock);
+}
+EXPORT_SYMBOL(put_user_pages_dirty_lock);
+
+/**
+ * put_user_pages() - release an array of gup-pinned pages.
+ * @pages: array of pages to be released.
+ * @npages: number of pages in the @pages array.
+ *
+ * For each page in the @pages array, release the page using put_user_page().
+ *
+ * Please see the put_user_page() documentation for details.
+ */
+void put_user_pages(struct page **pages, unsigned long npages)
+{
+ unsigned long index;
+
+ for (index = 0; index < npages; index++)
+ put_user_page(pages[index]);
+}
+EXPORT_SYMBOL(put_user_pages);
+
/*
* get_kernel_pages() - pin kernel pages in memory
* @kiov: An array of struct kvec structures
--
2.20.1


2019-02-04 05:22:43

by John Hubbard

Subject: [PATCH 6/6] mm/gup: Documentation/vm/get_user_pages.rst, MAINTAINERS

From: John Hubbard <[email protected]>

1. Added Documentation/vm/get_user_pages.rst

2. Added a GET_USER_PAGES entry in MAINTAINERS

Cc: Dan Williams <[email protected]>
Cc: Jan Kara <[email protected]>
Signed-off-by: Jérôme Glisse <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
Documentation/vm/get_user_pages.rst | 197 ++++++++++++++++++++++++++++
Documentation/vm/index.rst | 1 +
MAINTAINERS | 10 ++
3 files changed, 208 insertions(+)
create mode 100644 Documentation/vm/get_user_pages.rst

diff --git a/Documentation/vm/get_user_pages.rst b/Documentation/vm/get_user_pages.rst
new file mode 100644
index 000000000000..8598f20afb09
--- /dev/null
+++ b/Documentation/vm/get_user_pages.rst
@@ -0,0 +1,197 @@
+.. _get_user_pages:
+
+==============
+get_user_pages
+==============
+
+.. contents:: :local:
+
+Overview
+========
+
+Some kernel components (file systems, device drivers) need to access
+memory that is specified via process virtual address. For a long time, the
+API to achieve that was get_user_pages ("GUP") and its variations. However,
+GUP has critical limitations that have been overlooked; in particular, GUP
+does not interact correctly with filesystems in all situations. That means
+that file-backed memory + GUP is a recipe for potential problems, some of
+which have already occurred in the field.
+
+GUP was first introduced for Direct IO (O_DIRECT), allowing filesystem code
+to get the struct page behind a virtual address and to let storage hardware
+perform a direct copy to or from that page. This is a short-lived access
+pattern, and as such, the window for a concurrent writeback of a GUP'd page
+was small enough that there were not (we think) any reported problems.
+Also, userspace was expected to understand and accept that Direct IO was
+not synchronized with memory-mapped access to that data, nor with any
+process address space changes such as munmap(), mremap(), etc.
+
+Over the years, more GUP uses have appeared (virtualization, device
+drivers, RDMA) that can keep the pages they get via GUP for a long period
+of time (seconds, minutes, hours, days, ...). This long-term pinning makes
+an underlying design problem more obvious.
+
+In fact, there are a number of key problems inherent to GUP:
+
+Interactions with file systems
+==============================
+
+File systems expect to be able to write back data, both to reclaim pages,
+and for data integrity. Allowing other hardware (NICs, GPUs, etc) to gain
+write access to the file memory pages means that such hardware can dirty the
+pages, without the filesystem being aware. This can, in some cases
+(depending on filesystem, filesystem options, block device, block device
+options, and other variables), lead to data corruption, and also to kernel
+bugs of the form:
+
+::
+
+ kernel BUG at /build/linux-fQ94TU/linux-4.4.0/fs/ext4/inode.c:1899!
+ backtrace:
+
+ ext4_writepage
+ __writepage
+ write_cache_pages
+ ext4_writepages
+ do_writepages
+ __writeback_single_inode
+ writeback_sb_inodes
+ __writeback_inodes_wb
+ wb_writeback
+ wb_workfn
+ process_one_work
+ worker_thread
+ kthread
+ ret_from_fork
+
+...which is due to the file system asserting that there are still buffer
+heads attached:
+
+::
+
+ /* If we *know* page->private refers to buffer_heads */
+ #define page_buffers(page) \
+ ({ \
+ BUG_ON(!PagePrivate(page)); \
+ ((struct buffer_head *)page_private(page)); \
+ })
+ #define page_has_buffers(page) PagePrivate(page)
+
+Dave Chinner's description of this is very clear:
+
+ "The fundamental issue is that ->page_mkwrite must be called on every
+ write access to a clean file backed page, not just the first one.
+ How long the GUP reference lasts is irrelevant, if the page is clean
+ and you need to dirty it, you must call ->page_mkwrite before it is
+ marked writeable and dirtied. Every. Time."
+
+This is just one symptom of the larger design problem: filesystems do not
+actually support get_user_pages() being called on their pages, and letting
+hardware write directly to those pages--even though that pattern has been
+going on since about 2005 or so.
+
+Long term GUP
+=============
+
+Long term GUP is an issue when FOLL_WRITE is specified to GUP (so, a
+writeable mapping is created), and the pages are file-backed. That can lead
+to filesystem corruption. What happens is that when a file-backed page is
+being written back, it is first mapped read-only in all of the CPU page
+tables; the file system then assumes that nobody can write to the page, and
+that the page content is therefore stable. Unfortunately, the GUP callers
+generally do not monitor changes to the CPU page tables; they instead
+assume that the following pattern is safe (it's not):
+
+::
+
+ get_user_pages()
+
+ Hardware then keeps a reference to those pages for some potentially
+ long time. During this time, hardware may write to the pages. Because
+ "hardware" here means "devices that are not a CPU", this activity
+ occurs without any interaction with the kernel's file system code.
+
+ for each page:
+ set_page_dirty()
+ put_page()
+
+In fact, the GUP documentation even recommends that pattern.
+
+Anyway, the file system assumes that the page is stable (nothing is writing
+to the page), and that is a problem: stable page content is necessary for
+many filesystem actions during writeback, such as checksum, encryption,
+RAID striping, etc. Furthermore, filesystem features like COW (copy on
+write) or snapshot also rely on being able to use a new page as the
+memory for that range inside the file.
+
+Corruption during write back is clearly possible here. To solve that, one
+idea is to identify pages that have active GUP, so that we can use a bounce
+page to write stable data to the filesystem. The filesystem would work
+on the bounce page, while any active GUP users might write to the
+original page. This would avoid the stable page violation problem, but note
+that it is only part of the overall solution, because other problems
+remain.
+
+Other filesystem features that need to replace the page with a new one can
+be inhibited for pages that are GUP-pinned. This will, however, alter and
+limit some of those filesystem features. The only fix for that would be to
+require that GUP users monitor and respond to CPU page table updates. Subsystems
+such as ODP and HMM do this, for example. This aspect of the problem is
+still under discussion.
+
+Direct IO
+=========
+
+Direct IO can cause corruption, if userspace does Direct-IO that writes to
+a range of virtual addresses that are mmap'd to a file. The pages written
+to are file-backed pages that can be under write back, while the Direct IO
+is taking place. Here, Direct IO races with write back: it calls
+GUP before page_mkclean() has replaced the CPU pte with a read-only entry.
+The race window is pretty small, which is probably why years have gone by
+before we noticed this problem: Direct IO is generally very quick, and
+tends to finish up before the filesystem gets around to doing anything with
+the page contents. However, it's still a real problem. The solution is
+to never let GUP return pages that are under write back, but instead,
+force GUP to take a write fault on those pages. That way, GUP will
+properly synchronize with the active write back. This does not change the
+required GUP behavior, it just avoids that race.
+
+Measurement and visibility
+==========================
+
+There are several /proc/vmstat items, in order to provide some visibility
+into what get_user_pages() and put_user_page() are doing.
+
+After booting and running fio (https://github.com/axboe/fio)
+a few times on an NVMe device, as a way to get lots of
+get_user_pages_fast() calls, the counters look like this:
+
+::
+
+ $ cat /proc/vmstat | grep gup
+ nr_gup_slow_pages_requested 21319
+ nr_gup_fast_pages_requested 11533792
+ nr_gup_fast_page_backoffs 0
+ nr_gup_page_count_overflows 0
+ nr_gup_pages_returned 11555104
+
+Interpretation of the above:
+
+::
+
+ Total gup requests (slow + fast): 11555111
+ Total put_user_page calls: 11555104
+
+This shows 7 more calls to get_user_pages() than to put_user_page().
+That may, or may not, represent a problem worth investigating.
+
+Normally, those last two numbers should be equal, but a couple of things
+may cause them to differ:
+
+1. Inherent race condition in reading /proc/vmstat values.
+
+2. Bugs at any of the get_user_pages*() call sites. Those
+sites need to match get_user_pages() and put_user_page() calls.
+
+
+
diff --git a/Documentation/vm/index.rst b/Documentation/vm/index.rst
index 2b3ab3a1ccf3..433aaf1996e6 100644
--- a/Documentation/vm/index.rst
+++ b/Documentation/vm/index.rst
@@ -32,6 +32,7 @@ descriptions of data structures and algorithms.
balance
cleancache
frontswap
+ get_user_pages
highmem
hmm
hwpoison
diff --git a/MAINTAINERS b/MAINTAINERS
index 8c68de3cfd80..1e8f91b8ce4f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6384,6 +6384,16 @@ M: Frank Haverkamp <[email protected]>
S: Supported
F: drivers/misc/genwqe/

+GET_USER_PAGES
+M: Dan Williams <[email protected]>
+M: Jan Kara <[email protected]>
+M: Jérôme Glisse <[email protected]>
+M: John Hubbard <[email protected]>
+L: [email protected]
+S: Maintained
+F: mm/gup.c
+F: Documentation/vm/get_user_pages.rst
+
GET_MAINTAINER SCRIPT
M: Joe Perches <[email protected]>
S: Maintained
--
2.20.1


2019-02-04 05:24:14

by John Hubbard

Subject: [PATCH 5/6] mm/gup: /proc/vmstat support for get/put user pages

From: John Hubbard <[email protected]>

Add five new /proc/vmstat items, to provide some
visibility into what get_user_pages() and put_user_page()
are doing.

After booting and running fio (https://github.com/axboe/fio)
a few times on an NVMe device, as a way to get lots of
get_user_pages_fast() calls, the counters look like this:

$ cat /proc/vmstat |grep gup
nr_gup_slow_pages_requested 21319
nr_gup_fast_pages_requested 11533792
nr_gup_fast_page_backoffs 0
nr_gup_page_count_overflows 0
nr_gup_pages_returned 11555104

Interpretation of the above:
Total gup requests (slow + fast): 11555111
Total put_user_page calls: 11555104

This shows 7 more calls to get_user_pages() than to
put_user_page(). That may, or may not, represent a
problem worth investigating.

Normally, those last two numbers should be equal, but a
couple of things may cause them to differ:

1) Inherent race condition in reading /proc/vmstat values.

2) Bugs at any of the get_user_pages*() call sites. Those
sites need to match get_user_pages() and put_user_page() calls.

Signed-off-by: John Hubbard <[email protected]>
---
include/linux/mmzone.h | 5 +++++
mm/gup.c | 20 ++++++++++++++++++++
mm/swap.c | 1 +
mm/vmstat.c | 5 +++++
4 files changed, 31 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 842f9189537b..f20c14958a2b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -183,6 +183,11 @@ enum node_stat_item {
NR_DIRTIED, /* page dirtyings since bootup */
NR_WRITTEN, /* page writings since bootup */
NR_KERNEL_MISC_RECLAIMABLE, /* reclaimable non-slab kernel pages */
+ NR_GUP_SLOW_PAGES_REQUESTED, /* via: get_user_pages() */
+ NR_GUP_FAST_PAGES_REQUESTED, /* via: get_user_pages_fast() */
+ NR_GUP_FAST_PAGE_BACKOFFS, /* gup_fast() lost to page_mkclean() */
+ NR_GUP_PAGE_COUNT_OVERFLOWS, /* gup count overflowed: gup() failed */
+ NR_GUP_PAGES_RETURNED, /* via: put_user_page() */
NR_VM_NODE_STAT_ITEMS
};

diff --git a/mm/gup.c b/mm/gup.c
index 3291da342f9c..848ee7899831 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -37,6 +37,8 @@ int get_gup_pin_page(struct page *page)
page = compound_head(page);

if (page_ref_count(page) >= (UINT_MAX - GUP_PIN_COUNTING_BIAS)) {
+ mod_node_page_state(page_pgdat(page),
+ NR_GUP_PAGE_COUNT_OVERFLOWS, 1);
WARN_ONCE(1, "get_user_pages pin count overflowed");
return -EOVERFLOW;
}
@@ -184,6 +186,8 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
page = ERR_PTR(ret);
goto out;
}
+ mod_node_page_state(page_pgdat(page),
+ NR_GUP_SLOW_PAGES_REQUESTED, 1);
}
if (flags & FOLL_TOUCH) {
if ((flags & FOLL_WRITE) &&
@@ -527,6 +531,8 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
ret = get_gup_pin_page(*page);
if (ret)
goto unmap;
+
+ mod_node_page_state(page_pgdat(*page), NR_GUP_SLOW_PAGES_REQUESTED, 1);
out:
ret = 0;
unmap:
@@ -1461,7 +1467,12 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
if (!page_cache_gup_pin_speculative(head))
goto pte_unmap;

+ mod_node_page_state(page_pgdat(head),
+ NR_GUP_FAST_PAGES_REQUESTED, 1);
+
if (unlikely(pte_val(pte) != pte_val(*ptep))) {
+ mod_node_page_state(page_pgdat(head),
+ NR_GUP_FAST_PAGE_BACKOFFS, 1);
put_user_page(head);
goto pte_unmap;
}
@@ -1522,6 +1533,9 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,
return 0;
}

+ mod_node_page_state(page_pgdat(page),
+ NR_GUP_FAST_PAGES_REQUESTED, 1);
+
(*nr)++;
pfn++;
} while (addr += PAGE_SIZE, addr != end);
@@ -1607,6 +1621,8 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
return 0;
}

+ mod_node_page_state(page_pgdat(head), NR_GUP_FAST_PAGES_REQUESTED, 1);
+
if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) {
*nr -= refs;
put_user_page(head);
@@ -1644,6 +1660,8 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
return 0;
}

+ mod_node_page_state(page_pgdat(head), NR_GUP_FAST_PAGES_REQUESTED, 1);
+
if (unlikely(pud_val(orig) != pud_val(*pudp))) {
*nr -= refs;
put_user_page(head);
@@ -1680,6 +1698,8 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
return 0;
}

+ mod_node_page_state(page_pgdat(head), NR_GUP_FAST_PAGES_REQUESTED, 1);
+
if (unlikely(pgd_val(orig) != pgd_val(*pgdp))) {
*nr -= refs;
put_user_page(head);
diff --git a/mm/swap.c b/mm/swap.c
index 39b0ddd35933..49e192f242d4 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -150,6 +150,7 @@ void put_user_page(struct page *page)

VM_BUG_ON_PAGE(page_ref_count(page) < GUP_PIN_COUNTING_BIAS, page);

+ mod_node_page_state(page_pgdat(page), NR_GUP_PAGES_RETURNED, 1);
page_ref_sub(page, GUP_PIN_COUNTING_BIAS);
}
EXPORT_SYMBOL(put_user_page);
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 83b30edc2f7f..18a1a4a2dd29 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1164,6 +1164,11 @@ const char * const vmstat_text[] = {
"nr_dirtied",
"nr_written",
"nr_kernel_misc_reclaimable",
+ "nr_gup_slow_pages_requested",
+ "nr_gup_fast_pages_requested",
+ "nr_gup_fast_page_backoffs",
+ "nr_gup_page_count_overflows",
+ "nr_gup_pages_returned",

/* enum writeback_stat_item counters */
"nr_dirty_threshold",
--
2.20.1


2019-02-04 05:24:51

by John Hubbard

Subject: [PATCH 2/6] infiniband/mm: convert put_page() to put_user_page*()

From: John Hubbard <[email protected]>

For infiniband code that retains pages via get_user_pages*(),
release those pages via the new put_user_page() or
put_user_pages*() calls, instead of put_page().

This is a tiny part of the second step of fixing the problem described
in [1]. The steps are:

1) Provide put_user_page*() routines, intended to be used
for releasing pages that were pinned via get_user_pages*().

2) Convert all of the call sites for get_user_pages*(), to
invoke put_user_page*(), instead of put_page(). This involves dozens of
call sites, and will take some time.

3) After (2) is complete, use get_user_pages*() and put_user_page*() to
implement tracking of these pages. This tracking will be separate from
the existing struct page refcounting.

4) Use the tracking and identification of these pages, to implement
special handling (especially in writeback paths) when the pages are
backed by a filesystem. Again, [1] provides details as to why that is
desirable.

[1] https://lwn.net/Articles/753027/ : "The Trouble with get_user_pages()"

Cc: Doug Ledford <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Mike Marciniszyn <[email protected]>
Cc: Dennis Dalessandro <[email protected]>
Cc: Christian Benvenuti <[email protected]>

Reviewed-by: Jan Kara <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Acked-by: Jason Gunthorpe <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
drivers/infiniband/core/umem.c | 7 ++++---
drivers/infiniband/core/umem_odp.c | 2 +-
drivers/infiniband/hw/hfi1/user_pages.c | 11 ++++-------
drivers/infiniband/hw/mthca/mthca_memfree.c | 6 +++---
drivers/infiniband/hw/qib/qib_user_pages.c | 11 ++++-------
drivers/infiniband/hw/qib/qib_user_sdma.c | 6 +++---
drivers/infiniband/hw/usnic/usnic_uiom.c | 7 ++++---
7 files changed, 23 insertions(+), 27 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index c6144df47ea4..c2898bc7b3b2 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -58,9 +58,10 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int d
for_each_sg(umem->sg_head.sgl, sg, umem->npages, i) {

page = sg_page(sg);
- if (!PageDirty(page) && umem->writable && dirty)
- set_page_dirty_lock(page);
- put_page(page);
+ if (umem->writable && dirty)
+ put_user_pages_dirty_lock(&page, 1);
+ else
+ put_user_page(page);
}

sg_free_table(&umem->sg_head);
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index acb882f279cb..d32757c1f77e 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -663,7 +663,7 @@ int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt,
ret = -EFAULT;
break;
}
- put_page(local_page_list[j]);
+ put_user_page(local_page_list[j]);
continue;
}

diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
index e341e6dcc388..99ccc0483711 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -121,13 +121,10 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
size_t npages, bool dirty)
{
- size_t i;
-
- for (i = 0; i < npages; i++) {
- if (dirty)
- set_page_dirty_lock(p[i]);
- put_page(p[i]);
- }
+ if (dirty)
+ put_user_pages_dirty_lock(p, npages);
+ else
+ put_user_pages(p, npages);

if (mm) { /* during close after signal, mm can be NULL */
down_write(&mm->mmap_sem);
diff --git a/drivers/infiniband/hw/mthca/mthca_memfree.c b/drivers/infiniband/hw/mthca/mthca_memfree.c
index 112d2f38e0de..99108f3dcf01 100644
--- a/drivers/infiniband/hw/mthca/mthca_memfree.c
+++ b/drivers/infiniband/hw/mthca/mthca_memfree.c
@@ -481,7 +481,7 @@ int mthca_map_user_db(struct mthca_dev *dev, struct mthca_uar *uar,

ret = pci_map_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
if (ret < 0) {
- put_page(pages[0]);
+ put_user_page(pages[0]);
goto out;
}

@@ -489,7 +489,7 @@ int mthca_map_user_db(struct mthca_dev *dev, struct mthca_uar *uar,
mthca_uarc_virt(dev, uar, i));
if (ret) {
pci_unmap_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
- put_page(sg_page(&db_tab->page[i].mem));
+ put_user_page(sg_page(&db_tab->page[i].mem));
goto out;
}

@@ -555,7 +555,7 @@ void mthca_cleanup_user_db_tab(struct mthca_dev *dev, struct mthca_uar *uar,
if (db_tab->page[i].uvirt) {
mthca_UNMAP_ICM(dev, mthca_uarc_virt(dev, uar, i), 1);
pci_unmap_sg(dev->pdev, &db_tab->page[i].mem, 1, PCI_DMA_TODEVICE);
- put_page(sg_page(&db_tab->page[i].mem));
+ put_user_page(sg_page(&db_tab->page[i].mem));
}
}

diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c
index 16543d5e80c3..1a5c64c8695f 100644
--- a/drivers/infiniband/hw/qib/qib_user_pages.c
+++ b/drivers/infiniband/hw/qib/qib_user_pages.c
@@ -40,13 +40,10 @@
static void __qib_release_user_pages(struct page **p, size_t num_pages,
int dirty)
{
- size_t i;
-
- for (i = 0; i < num_pages; i++) {
- if (dirty)
- set_page_dirty_lock(p[i]);
- put_page(p[i]);
- }
+ if (dirty)
+ put_user_pages_dirty_lock(p, num_pages);
+ else
+ put_user_pages(p, num_pages);
}

/*
diff --git a/drivers/infiniband/hw/qib/qib_user_sdma.c b/drivers/infiniband/hw/qib/qib_user_sdma.c
index 31c523b2a9f5..a1a1ec4adffc 100644
--- a/drivers/infiniband/hw/qib/qib_user_sdma.c
+++ b/drivers/infiniband/hw/qib/qib_user_sdma.c
@@ -320,7 +320,7 @@ static int qib_user_sdma_page_to_frags(const struct qib_devdata *dd,
* the caller can ignore this page.
*/
if (put) {
- put_page(page);
+ put_user_page(page);
} else {
/* coalesce case */
kunmap(page);
@@ -634,7 +634,7 @@ static void qib_user_sdma_free_pkt_frag(struct device *dev,
kunmap(pkt->addr[i].page);

if (pkt->addr[i].put_page)
- put_page(pkt->addr[i].page);
+ put_user_page(pkt->addr[i].page);
else
__free_page(pkt->addr[i].page);
} else if (pkt->addr[i].kvaddr) {
@@ -709,7 +709,7 @@ static int qib_user_sdma_pin_pages(const struct qib_devdata *dd,
/* if error, return all pages not managed by pkt */
free_pages:
while (i < j)
- put_page(pages[i++]);
+ put_user_page(pages[i++]);

done:
return ret;
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 49275a548751..2ef8d31dc838 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -77,9 +77,10 @@ static void usnic_uiom_put_pages(struct list_head *chunk_list, int dirty)
for_each_sg(chunk->page_list, sg, chunk->nents, i) {
page = sg_page(sg);
pa = sg_phys(sg);
- if (!PageDirty(page) && dirty)
- set_page_dirty_lock(page);
- put_page(page);
+ if (dirty)
+ put_user_pages_dirty_lock(&page, 1);
+ else
+ put_user_page(page);
usnic_dbg("pa: %pa\n", &pa);
}
kfree(chunk);
--
2.20.1


Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On Sun, 3 Feb 2019, [email protected] wrote:

> Some kernel components (file systems, device drivers) need to access
> memory that is specified via process virtual address. For a long time, the
> API to achieve that was get_user_pages ("GUP") and its variations. However,
> GUP has critical limitations that have been overlooked; in particular, GUP
> does not interact correctly with filesystems in all situations. That means
> that file-backed memory + GUP is a recipe for potential problems, some of
> which have already occurred in the field.

It may be worth noting a couple of times in this text that this was
designed for anonymous memory and that such use is/was ok. We are talking
about a use case here using mmapped access with a regular filesystem that
was not initially intended. The mmapping from the hugepages filesystem
is special in that it is not a device that is actually writing things
back.

Any use with a filesystem that actually writes data back to a medium
is something that is broken.



2019-02-04 17:20:07

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On Mon, Feb 04, 2019 at 04:08:02PM +0000, Christopher Lameter wrote:
> It may be worth noting a couple of times in this text that this was
> designed for anonymous memory and that such use is/was ok. We are talking
> about a use case here using mmapped access with a regular filesystem that
> was not initially intended. The mmapping from the hugepages filesystem
> is special in that it is not a device that is actually writing things
> back.
>
> Any use with a filesystem that actually writes data back to a medium
> is something that is broken.

Saying it was not intended seems rather odd, as it was supported
since day 0 and people made use of it.

Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On Mon, 4 Feb 2019, Christoph Hellwig wrote:

> On Mon, Feb 04, 2019 at 04:08:02PM +0000, Christopher Lameter wrote:
> > It may be worth noting a couple of times in this text that this was
> > designed for anonymous memory and that such use is/was ok. We are talking
> > about a use case here using mmapped access with a regular filesystem that
> > was not initially intended. The mmapping from the hugepages filesystem
> > is special in that it is not a device that is actually writing things
> > back.
> >
> > Any use with a filesystem that actually writes data back to a medium
> > is something that is broken.
>
> Saying it was not intended seems rather odd, as it was supported
> since day 0 and people made use of it.

Well until last year I never thought there was a problem because I
considered it separate from regular filesystem I/O.





Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

Frankly I still think this does not solve anything.

Concurrent write access from two sources to a single page is simply wrong.
You cannot make this right by allowing long-term RDMA pins in a filesystem
and thus preventing the filesystem from ever updating part of its files on disk.

Can we just disable RDMA to regular filesystems? Regular filesystems
should have full control of the write back and dirty status of their
pages.

Special filesystems that do not actually do write back (like hugetlbfs),
mmaped raw device files and anonymous allocations are fine.


2019-02-04 17:53:19

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On Mon, Feb 04, 2019 at 05:14:19PM +0000, Christopher Lameter wrote:
> Frankly I still think this does not solve anything.
>
> Concurrent write access from two sources to a single page is simply wrong.
> You cannot make this right by allowing long-term RDMA pins in a filesystem
> and thus preventing the filesystem from ever updating part of its files on disk.

Fundamentally this patch series is fixing O_DIRECT to not crash the
kernel in extreme cases. RDMA has the same problem, but it is much
easier to hit.

I think questions related to RDMA are somewhat separate, and maybe it
should be blocked, or not, but either way O_DIRECT has to be fixed.

Jason

2019-02-04 18:20:37

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 4/6] mm/gup: track gup-pinned pages

On Sun, Feb 03, 2019 at 09:21:33PM -0800, [email protected] wrote:
> +/*
> + * GUP_PIN_COUNTING_BIAS, and the associated functions that use it, overload
> + * the page's refcount so that two separate items are tracked: the original page
> + * reference count, and also a new count of how many get_user_pages() calls were
> + * made against the page. ("gup-pinned" is another term for the latter).
> + *
> + * With this scheme, get_user_pages() becomes special: such pages are marked
> + * as distinct from normal pages. As such, the new put_user_page() call (and
> + * its variants) must be used in order to release gup-pinned pages.
> + *
> + * Choice of value:
> + *
> + * By making GUP_PIN_COUNTING_BIAS a power of two, debugging of page reference
> + * counts with respect to get_user_pages() and put_user_page() becomes simpler,
> + * due to the fact that adding an even power of two to the page refcount has
> + * the effect of using only the upper N bits, for the code that counts up using
> + * the bias value. This means that the lower bits are left for the exclusive
> + * use of the original code that increments and decrements by one (or at least,
> + * by much smaller values than the bias value).
> + *
> + * Of course, once the lower bits overflow into the upper bits (and this is
> + * OK, because subtraction recovers the original values), then visual inspection
> + * no longer suffices to directly view the separate counts. However, for normal
> + * applications that don't have huge page reference counts, this won't be an
> + * issue.
> + *
> + * This has to work on 32-bit as well as 64-bit systems. In the more constrained
> + * 32-bit systems, the 10 bit value of the bias value leaves 22 bits for the
> + * upper bits. Therefore, only about 4M calls to get_user_page() may occur for
> + * a page.

The refcount is 32-bit on both 64 and 32 bit systems. This limit
exists on both sizes of system.
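
To make the counting scheme concrete, here is a minimal sketch of the
arithmetic (an editor's illustration; GUP_PIN_COUNTING_BIAS and the
refcount comparison are taken from the patch, the example numbers are
not):

    #include <linux/mm.h>

    /* From the patch: */
    #define GUP_PIN_COUNTING_BIAS (1UL << 10)

    /*
     * One 32-bit refcount carries two counts at once:
     *
     *     refcount = gup_pins * GUP_PIN_COUNTING_BIAS + normal_refs
     *
     * For example, 3 gup pins plus 5 ordinary references yields
     * 3 * 1024 + 5 = 3077. Each put_user_page() subtracts the bias and
     * each put_page() subtracts 1, so both counts unwind cleanly.
     */
    static inline bool page_gup_pinned(struct page *page)
    {
            return page_ref_count(page) > GUP_PIN_COUNTING_BIAS;
    }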

Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On Mon, 4 Feb 2019, Jason Gunthorpe wrote:

> On Mon, Feb 04, 2019 at 05:14:19PM +0000, Christopher Lameter wrote:
> > Frankly I still think this does not solve anything.
> >
> > Concurrent write access from two sources to a single page is simply wrong.
> > You cannot make this right by allowing long-term RDMA pins in a filesystem
> > and thus preventing the filesystem from ever updating part of its files on disk.
>
> Fundamentally this patch series is fixing O_DIRECT to not crash the
> kernel in extreme cases. RDMA has the same problem, but it is much
> easier to hit.

O_DIRECT is the same issue. O_DIRECT addresses always have been in
anonymous memory or special file systems.

2019-02-04 19:10:13

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On Mon, Feb 04, 2019 at 06:21:39PM +0000, Christopher Lameter wrote:
> On Mon, 4 Feb 2019, Jason Gunthorpe wrote:
>
> > On Mon, Feb 04, 2019 at 05:14:19PM +0000, Christopher Lameter wrote:
> > > Frankly I still think this does not solve anything.
> > >
> > > Concurrent write access from two sources to a single page is simply wrong.
> > > You cannot make this right by allowing long-term RDMA pins in a filesystem
> > > and thus preventing the filesystem from ever updating part of its files on disk.
> >
> > Fundamentally this patch series is fixing O_DIRECT to not crash the
> > kernel in extreme cases. RDMA has the same problem, but it is much
> > easier to hit.
>
> O_DIRECT is the same issue. O_DIRECT addresses always have been in
> anonymous memory or special file systems.

That's never been a constraint that's existed.

2019-02-04 19:13:39

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH 4/6] mm/gup: track gup-pinned pages

On 2/4/19 10:19 AM, Matthew Wilcox wrote:
> On Sun, Feb 03, 2019 at 09:21:33PM -0800, [email protected] wrote:
>> +/*
>> + * GUP_PIN_COUNTING_BIAS, and the associated functions that use it, overload
>> + * the page's refcount so that two separate items are tracked: the original page
>> + * reference count, and also a new count of how many get_user_pages() calls were
>> + * made against the page. ("gup-pinned" is another term for the latter).
>> + *
>> + * With this scheme, get_user_pages() becomes special: such pages are marked
>> + * as distinct from normal pages. As such, the new put_user_page() call (and
>> + * its variants) must be used in order to release gup-pinned pages.
>> + *
>> + * Choice of value:
>> + *
>> + * By making GUP_PIN_COUNTING_BIAS a power of two, debugging of page reference
>> + * counts with respect to get_user_pages() and put_user_page() becomes simpler,
>> + * due to the fact that adding an even power of two to the page refcount has
>> + * the effect of using only the upper N bits, for the code that counts up using
>> + * the bias value. This means that the lower bits are left for the exclusive
>> + * use of the original code that increments and decrements by one (or at least,
>> + * by much smaller values than the bias value).
>> + *
>> + * Of course, once the lower bits overflow into the upper bits (and this is
>> + * OK, because subtraction recovers the original values), then visual inspection
>> + * no longer suffices to directly view the separate counts. However, for normal
>> + * applications that don't have huge page reference counts, this won't be an
>> + * issue.
>> + *
>> + * This has to work on 32-bit as well as 64-bit systems. In the more constrained
>> + * 32-bit systems, the 10 bit value of the bias value leaves 22 bits for the
>> + * upper bits. Therefore, only about 4M calls to get_user_page() may occur for
>> + * a page.
>
> The refcount is 32-bit on both 64 and 32 bit systems. This limit
> exists on both sizes of system.
>

Oh right, I'll just delete that last paragraph, then. Thanks for catching that.


thanks,
--
John Hubbard
NVIDIA

2019-02-04 23:35:57

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On Mon, Feb 04, 2019 at 05:14:19PM +0000, Christopher Lameter wrote:
> Frankly I still think this does not solve anything.
>
> Concurrent write access from two sources to a single page is simply wrong.
> You cannot make this right by allowing long-term RDMA pins in a filesystem
> and thus preventing the filesystem from ever updating part of its files on disk.
>
> Can we just disable RDMA to regular filesystems? Regular filesystems
> should have full control of the write back and dirty status of their
> pages.

That may be a solution to the corruption/crashes but it is not a solution which
users want to see. RDMA directly to file systems (specifically DAX) is a use
case we have seen customers ask for.

I think this is the correct path toward supporting this use case.

Ira


2019-02-05 01:49:46

by Tom Talpey

[permalink] [raw]
Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On 2/4/2019 12:21 AM, [email protected] wrote:
> From: John Hubbard <[email protected]>
>
>
> Performance: here is an fio run on an NVMe drive, using this for the fio
> configuration file:
>
> [reader]
> direct=1
> ioengine=libaio
> blocksize=4096
> size=1g
> numjobs=1
> rw=read
> iodepth=64
>
> reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
> fio-3.3
> Starting 1 process
> Jobs: 1 (f=1)
> reader: (groupid=0, jobs=1): err= 0: pid=7011: Sun Feb 3 20:36:51 2019
> read: IOPS=190k, BW=741MiB/s (778MB/s)(1024MiB/1381msec)
> slat (nsec): min=2716, max=57255, avg=4048.14, stdev=1084.10
> clat (usec): min=20, max=12485, avg=332.63, stdev=191.77
> lat (usec): min=22, max=12498, avg=336.72, stdev=192.07
> clat percentiles (usec):
> | 1.00th=[ 322], 5.00th=[ 322], 10.00th=[ 322], 20.00th=[ 326],
> | 30.00th=[ 326], 40.00th=[ 326], 50.00th=[ 326], 60.00th=[ 326],
> | 70.00th=[ 326], 80.00th=[ 330], 90.00th=[ 330], 95.00th=[ 330],
> | 99.00th=[ 478], 99.50th=[ 717], 99.90th=[ 1074], 99.95th=[ 1090],
> | 99.99th=[12256]

These latencies are concerning. The best results we saw at the end of
November (previous approach) were MUCH flatter. These really start
spiking at three 9's, and are sky-high at four 9's. The "stdev" values
for clat and lat are about 10 times the previous. There's some kind
of serious queuing contention here, that wasn't there in November.

> bw ( KiB/s): min=730152, max=776512, per=99.22%, avg=753332.00, stdev=32781.47, samples=2
> iops : min=182538, max=194128, avg=188333.00, stdev=8195.37, samples=2
> lat (usec) : 50=0.01%, 100=0.01%, 250=0.07%, 500=99.26%, 750=0.38%
> lat (usec) : 1000=0.02%
> lat (msec) : 2=0.24%, 20=0.02%
> cpu : usr=15.07%, sys=84.13%, ctx=10, majf=0, minf=74

System CPU 84% is roughly double the November results of 45%. Ouch.

Did you re-run the baseline on the new unpatched base kernel and can
we see the before/after?

Tom.

> IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
> submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
> issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
> latency : target=0, window=0, percentile=100.00%, depth=64
>
> Run status group 0 (all jobs):
> READ: bw=741MiB/s (778MB/s), 741MiB/s-741MiB/s (778MB/s-778MB/s), io=1024MiB (1074MB), run=1381-1381msec
>
> Disk stats (read/write):
> nvme0n1: ios=216966/0, merge=0/0, ticks=6112/0, in_queue=704, util=91.34%

2019-02-05 08:33:09

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On 2/4/19 5:41 PM, Tom Talpey wrote:
> On 2/4/2019 12:21 AM, [email protected] wrote:
>> From: John Hubbard <[email protected]>
>>
>>
>> Performance: here is an fio run on an NVMe drive, using this for the fio
>> configuration file:
>>
>>      [reader]
>>      direct=1
>>      ioengine=libaio
>>      blocksize=4096
>>      size=1g
>>      numjobs=1
>>      rw=read
>>      iodepth=64
>>
>> reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
>> 4096B-4096B, ioengine=libaio, iodepth=64
>> fio-3.3
>> Starting 1 process
>> Jobs: 1 (f=1)
>> reader: (groupid=0, jobs=1): err= 0: pid=7011: Sun Feb  3 20:36:51 2019
>>     read: IOPS=190k, BW=741MiB/s (778MB/s)(1024MiB/1381msec)
>>      slat (nsec): min=2716, max=57255, avg=4048.14, stdev=1084.10
>>      clat (usec): min=20, max=12485, avg=332.63, stdev=191.77
>>       lat (usec): min=22, max=12498, avg=336.72, stdev=192.07
>>      clat percentiles (usec):
>>       |  1.00th=[  322],  5.00th=[  322], 10.00th=[  322], 20.00th=[
>> 326],
>>       | 30.00th=[  326], 40.00th=[  326], 50.00th=[  326], 60.00th=[
>> 326],
>>       | 70.00th=[  326], 80.00th=[  330], 90.00th=[  330], 95.00th=[
>> 330],
>>       | 99.00th=[  478], 99.50th=[  717], 99.90th=[ 1074], 99.95th=[
>> 1090],
>>       | 99.99th=[12256]
>
> These latencies are concerning. The best results we saw at the end of
> November (previous approach) were MUCH flatter. These really start
> spiking at three 9's, and are sky-high at four 9's. The "stdev" values
> for clat and lat are about 10 times the previous. There's some kind
> of serious queuing contention here, that wasn't there in November.

Hi Tom,

I think this latency problem is also there in the baseline kernel, but...

>
>>     bw (  KiB/s): min=730152, max=776512, per=99.22%, avg=753332.00,
>> stdev=32781.47, samples=2
>>     iops        : min=182538, max=194128, avg=188333.00,
>> stdev=8195.37, samples=2
>>    lat (usec)   : 50=0.01%, 100=0.01%, 250=0.07%, 500=99.26%, 750=0.38%
>>    lat (usec)   : 1000=0.02%
>>    lat (msec)   : 2=0.24%, 20=0.02%
>>    cpu          : usr=15.07%, sys=84.13%, ctx=10, majf=0, minf=74
>
> System CPU 84% is roughly double the November results of 45%. Ouch.

That's my fault. First of all, I had a few extra, supposedly minor debug
settings in the .config, which I'm removing now--I'm doing a proper run
with the original .config file from November, below. Second, I'm not
sure I controlled the run carefully enough.

>
> Did you re-run the baseline on the new unpatched base kernel and can
> we see the before/after?

Doing that now, I see:

-- No significant perf difference between before and after, but
-- Still high clat in the 99.99th

=======================================================================
Before: using commit 8834f5600cf3 ("Linux 5.0-rc5")
===================================================
reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=libaio, iodepth=64
fio-3.3
Starting 1 process
Jobs: 1 (f=1)
reader: (groupid=0, jobs=1): err= 0: pid=1829: Tue Feb 5 00:08:08 2019
read: IOPS=193k, BW=753MiB/s (790MB/s)(1024MiB/1359msec)
slat (nsec): min=1269, max=40309, avg=1493.66, stdev=534.83
clat (usec): min=127, max=12249, avg=329.83, stdev=184.92
lat (usec): min=129, max=12256, avg=331.35, stdev=185.06
clat percentiles (usec):
| 1.00th=[ 326], 5.00th=[ 326], 10.00th=[ 326], 20.00th=[ 326],
| 30.00th=[ 326], 40.00th=[ 326], 50.00th=[ 326], 60.00th=[ 326],
| 70.00th=[ 326], 80.00th=[ 326], 90.00th=[ 326], 95.00th=[ 326],
| 99.00th=[ 347], 99.50th=[ 519], 99.90th=[ 529], 99.95th=[ 537],
| 99.99th=[12125]
bw ( KiB/s): min=755032, max=781472, per=99.57%, avg=768252.00,
stdev=18695.90, samples=2
iops : min=188758, max=195368, avg=192063.00, stdev=4673.98,
samples=2
lat (usec) : 250=0.08%, 500=99.18%, 750=0.72%
lat (msec) : 20=0.02%
cpu : usr=12.30%, sys=46.83%, ctx=253554, majf=0, minf=74
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
>=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
>=64=0.0%
issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
READ: bw=753MiB/s (790MB/s), 753MiB/s-753MiB/s (790MB/s-790MB/s),
io=1024MiB (1074MB), run=1359-1359msec

Disk stats (read/write):
nvme0n1: ios=221246/0, merge=0/0, ticks=71556/0, in_queue=704,
util=91.35%

=======================================================================
After:
=======================================================================
reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
4096B-4096B, ioengine=libaio, iodepth=64
fio-3.3
Starting 1 process
Jobs: 1 (f=1)
reader: (groupid=0, jobs=1): err= 0: pid=1803: Mon Feb 4 23:58:07 2019
read: IOPS=193k, BW=753MiB/s (790MB/s)(1024MiB/1359msec)
slat (nsec): min=1276, max=41900, avg=1505.36, stdev=565.26
clat (usec): min=177, max=12186, avg=329.88, stdev=184.03
lat (usec): min=178, max=12192, avg=331.42, stdev=184.16
clat percentiles (usec):
| 1.00th=[ 326], 5.00th=[ 326], 10.00th=[ 326], 20.00th=[ 326],
| 30.00th=[ 326], 40.00th=[ 326], 50.00th=[ 326], 60.00th=[ 326],
| 70.00th=[ 326], 80.00th=[ 326], 90.00th=[ 326], 95.00th=[ 326],
| 99.00th=[ 359], 99.50th=[ 498], 99.90th=[ 537], 99.95th=[ 627],
| 99.99th=[12125]
bw ( KiB/s): min=754656, max=781504, per=99.55%, avg=768080.00,
stdev=18984.40, samples=2
iops : min=188664, max=195378, avg=192021.00, stdev=4747.51,
samples=2
lat (usec) : 250=0.12%, 500=99.40%, 750=0.46%
lat (msec) : 20=0.02%
cpu : usr=12.44%, sys=47.05%, ctx=252127, majf=0, minf=73
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
>=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
>=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
>=64=0.0%
issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
READ: bw=753MiB/s (790MB/s), 753MiB/s-753MiB/s (790MB/s-790MB/s),
io=1024MiB (1074MB), run=1359-1359msec

Disk stats (read/write):
nvme0n1: ios=221203/0, merge=0/0, ticks=71291/0, in_queue=704,
util=91.19%

How's this look to you?

thanks,
--
John Hubbard
NVIDIA

2019-02-05 13:47:02

by Tom Talpey

[permalink] [raw]
Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On 2/5/2019 3:22 AM, John Hubbard wrote:
> On 2/4/19 5:41 PM, Tom Talpey wrote:
>> On 2/4/2019 12:21 AM, [email protected] wrote:
>>> From: John Hubbard <[email protected]>
>>>
>>>
>>> Performance: here is an fio run on an NVMe drive, using this for the fio
>>> configuration file:
>>>
>>>      [reader]
>>>      direct=1
>>>      ioengine=libaio
>>>      blocksize=4096
>>>      size=1g
>>>      numjobs=1
>>>      rw=read
>>>      iodepth=64
>>>
>>> reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
>>> 4096B-4096B, ioengine=libaio, iodepth=64
>>> fio-3.3
>>> Starting 1 process
>>> Jobs: 1 (f=1)
>>> reader: (groupid=0, jobs=1): err= 0: pid=7011: Sun Feb  3 20:36:51 2019
>>>     read: IOPS=190k, BW=741MiB/s (778MB/s)(1024MiB/1381msec)
>>>      slat (nsec): min=2716, max=57255, avg=4048.14, stdev=1084.10
>>>      clat (usec): min=20, max=12485, avg=332.63, stdev=191.77
>>>       lat (usec): min=22, max=12498, avg=336.72, stdev=192.07
>>>      clat percentiles (usec):
>>>       |  1.00th=[  322],  5.00th=[  322], 10.00th=[  322], 20.00th=[
>>> 326],
>>>       | 30.00th=[  326], 40.00th=[  326], 50.00th=[  326], 60.00th=[
>>> 326],
>>>       | 70.00th=[  326], 80.00th=[  330], 90.00th=[  330], 95.00th=[
>>> 330],
>>>       | 99.00th=[  478], 99.50th=[  717], 99.90th=[ 1074], 99.95th=[
>>> 1090],
>>>       | 99.99th=[12256]
>>
>> These latencies are concerning. The best results we saw at the end of
>> November (previous approach) were MUCH flatter. These really start
>> spiking at three 9's, and are sky-high at four 9's. The "stdev" values
>> for clat and lat are about 10 times the previous. There's some kind
>> of serious queuing contention here, that wasn't there in November.
>
> Hi Tom,
>
> I think this latency problem is also there in the baseline kernel, but...
>
>>
>>>     bw (  KiB/s): min=730152, max=776512, per=99.22%, avg=753332.00,
>>> stdev=32781.47, samples=2
>>>     iops        : min=182538, max=194128, avg=188333.00,
>>> stdev=8195.37, samples=2
>>>    lat (usec)   : 50=0.01%, 100=0.01%, 250=0.07%, 500=99.26%, 750=0.38%
>>>    lat (usec)   : 1000=0.02%
>>>    lat (msec)   : 2=0.24%, 20=0.02%
>>>    cpu          : usr=15.07%, sys=84.13%, ctx=10, majf=0, minf=74
>>
>> System CPU 84% is roughly double the November results of 45%. Ouch.
>
> That's my fault. First of all, I had a few extra, supposedly minor debug
> settings in the .config, which I'm removing now--I'm doing a proper run
> with the original .config file from November, below. Second, I'm not
> sure I controlled the run carefully enough.
>
>>
>> Did you re-run the baseline on the new unpatched base kernel and can
>> we see the before/after?
>
> Doing that now, I see:
>
> -- No significant perf difference between before and after, but
> -- Still high clat in the 99.99th
>
> =======================================================================
> Before: using commit 8834f5600cf3 ("Linux 5.0-rc5")
> ===================================================
> reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=libaio, iodepth=64
> fio-3.3
> Starting 1 process
> Jobs: 1 (f=1)
> reader: (groupid=0, jobs=1): err= 0: pid=1829: Tue Feb  5 00:08:08 2019
>    read: IOPS=193k, BW=753MiB/s (790MB/s)(1024MiB/1359msec)
>     slat (nsec): min=1269, max=40309, avg=1493.66, stdev=534.83
>     clat (usec): min=127, max=12249, avg=329.83, stdev=184.92
>      lat (usec): min=129, max=12256, avg=331.35, stdev=185.06
>     clat percentiles (usec):
>      |  1.00th=[  326],  5.00th=[  326], 10.00th=[  326], 20.00th=[  326],
>      | 30.00th=[  326], 40.00th=[  326], 50.00th=[  326], 60.00th=[  326],
>      | 70.00th=[  326], 80.00th=[  326], 90.00th=[  326], 95.00th=[  326],
>      | 99.00th=[  347], 99.50th=[  519], 99.90th=[  529], 99.95th=[  537],
>      | 99.99th=[12125]
>    bw (  KiB/s): min=755032, max=781472, per=99.57%, avg=768252.00,
> stdev=18695.90, samples=2
>    iops        : min=188758, max=195368, avg=192063.00, stdev=4673.98,
> samples=2
>   lat (usec)   : 250=0.08%, 500=99.18%, 750=0.72%
>   lat (msec)   : 20=0.02%
>   cpu          : usr=12.30%, sys=46.83%, ctx=253554, majf=0, minf=74
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
> >=64=100.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
> >=64=0.0%
>      issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=64
>
> Run status group 0 (all jobs):
>    READ: bw=753MiB/s (790MB/s), 753MiB/s-753MiB/s (790MB/s-790MB/s),
> io=1024MiB (1074MB), run=1359-1359msec
>
> Disk stats (read/write):
>   nvme0n1: ios=221246/0, merge=0/0, ticks=71556/0, in_queue=704,
> util=91.35%
>
> =======================================================================
> After:
> =======================================================================
> reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T)
> 4096B-4096B, ioengine=libaio, iodepth=64
> fio-3.3
> Starting 1 process
> Jobs: 1 (f=1)
> reader: (groupid=0, jobs=1): err= 0: pid=1803: Mon Feb  4 23:58:07 2019
>    read: IOPS=193k, BW=753MiB/s (790MB/s)(1024MiB/1359msec)
>     slat (nsec): min=1276, max=41900, avg=1505.36, stdev=565.26
>     clat (usec): min=177, max=12186, avg=329.88, stdev=184.03
>      lat (usec): min=178, max=12192, avg=331.42, stdev=184.16
>     clat percentiles (usec):
>      |  1.00th=[  326],  5.00th=[  326], 10.00th=[  326], 20.00th=[  326],
>      | 30.00th=[  326], 40.00th=[  326], 50.00th=[  326], 60.00th=[  326],
>      | 70.00th=[  326], 80.00th=[  326], 90.00th=[  326], 95.00th=[  326],
>      | 99.00th=[  359], 99.50th=[  498], 99.90th=[  537], 99.95th=[  627],
>      | 99.99th=[12125]
>    bw (  KiB/s): min=754656, max=781504, per=99.55%, avg=768080.00,
> stdev=18984.40, samples=2
>    iops        : min=188664, max=195378, avg=192021.00, stdev=4747.51,
> samples=2
>   lat (usec)   : 250=0.12%, 500=99.40%, 750=0.46%
>   lat (msec)   : 20=0.02%
>   cpu          : usr=12.44%, sys=47.05%, ctx=252127, majf=0, minf=73
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%,
> >=64=100.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
> >=64=0.0%
>      issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=64
>
> Run status group 0 (all jobs):
>    READ: bw=753MiB/s (790MB/s), 753MiB/s-753MiB/s (790MB/s-790MB/s),
> io=1024MiB (1074MB), run=1359-1359msec
>
> Disk stats (read/write):
>   nvme0n1: ios=221203/0, merge=0/0, ticks=71291/0, in_queue=704,
> util=91.19%
>
> How's this look to you?

Ok, I'm satisfied the four-9's latency spike is not in your code. :-)
Results look good relative to baseline. Thanks for doublechecking!

Tom.

2019-02-05 16:44:56

by Mike Rapoport

[permalink] [raw]
Subject: Re: [PATCH 6/6] mm/gup: Documentation/vm/get_user_pages.rst, MAINTAINERS

Hi John,

On Sun, Feb 03, 2019 at 09:21:35PM -0800, [email protected] wrote:
> From: John Hubbard <[email protected]>
>
> 1. Added Documentation/vm/get_user_pages.rst
>
> 2. Added a GET_USER_PAGES entry in MAINTAINERS
>
> Cc: Dan Williams <[email protected]>
> Cc: Jan Kara <[email protected]>
> Signed-off-by: Jérôme Glisse <[email protected]>
> Signed-off-by: John Hubbard <[email protected]>
> ---
> Documentation/vm/get_user_pages.rst | 197 ++++++++++++++++++++++++++++
> Documentation/vm/index.rst | 1 +
> MAINTAINERS | 10 ++
> 3 files changed, 208 insertions(+)
> create mode 100644 Documentation/vm/get_user_pages.rst
>
> diff --git a/Documentation/vm/get_user_pages.rst b/Documentation/vm/get_user_pages.rst
> new file mode 100644
> index 000000000000..8598f20afb09
> --- /dev/null
> +++ b/Documentation/vm/get_user_pages.rst

It's great to see docs coming along with the patches! :)

Yet, I'm a bit confused. The documentation here mostly describes the
existing problems that this patchset aims to solve, but the text here does
not describe the proposed solution.

> @@ -0,0 +1,197 @@
> +.. _get_user_pages:
> +
> +==============
> +get_user_pages
> +==============
> +
> +.. contents:: :local:
> +
> +Overview
> +========
> +
> +Some kernel components (file systems, device drivers) need to access
> +memory that is specified via process virtual address. For a long time, the
> +API to achieve that was get_user_pages ("GUP") and its variations. However,
> +GUP has critical limitations that have been overlooked; in particular, GUP
> +does not interact correctly with filesystems in all situations. That means
> +that file-backed memory + GUP is a recipe for potential problems, some of
> +which have already occurred in the field.
> +
> +GUP was first introduced for Direct IO (O_DIRECT), allowing filesystem code
> +to get the struct page behind a virtual address and to let storage hardware
> +perform a direct copy to or from that page. This is a short-lived access
> +pattern, and as such, the window for a concurrent writeback of a GUP'd page
> +was small enough that there were not (we think) any reported problems.
> +Also, userspace was expected to understand and accept that Direct IO was
> +not synchronized with memory-mapped access to that data, nor with any
> +process address space changes such as munmap(), mremap(), etc.
> +
> +Over the years, more GUP uses have appeared (virtualization, device
> +drivers, RDMA) that can keep the pages they get via GUP for a long period
> +of time (seconds, minutes, hours, days, ...). This long-term pinning makes
> +an underlying design problem more obvious.
> +
> +In fact, there are a number of key problems inherent to GUP:
> +
> +Interactions with file systems
> +==============================
> +
> +File systems expect to be able to write back data, both to reclaim pages,
> +and for data integrity. Allowing other hardware (NICs, GPUs, etc) to gain
> +write access to the file memory pages means that such hardware can dirty the
> +pages, without the filesystem being aware. This can, in some cases
> +(depending on filesystem, filesystem options, block device, block device
> +options, and other variables), lead to data corruption, and also to kernel
> +bugs of the form:
> +
> +::
> +
> + kernel BUG at /build/linux-fQ94TU/linux-4.4.0/fs/ext4/inode.c:1899!
> + backtrace:
> +
> + ext4_writepage
> + __writepage
> + write_cache_pages
> + ext4_writepages
> + do_writepages
> + __writeback_single_inode
> + writeback_sb_inodes
> + __writeback_inodes_wb
> + wb_writeback
> + wb_workfn
> + process_one_work
> + worker_thread
> + kthread
> + ret_from_fork
> +
> +...which is due to the file system asserting that there are still buffer
> +heads attached:
> +
> +::
> +
> + /* If we *know* page->private refers to buffer_heads */
> + #define page_buffers(page) \
> + ({ \
> + BUG_ON(!PagePrivate(page)); \
> + ((struct buffer_head *)page_private(page)); \
> + })
> + #define page_has_buffers(page) PagePrivate(page)
> +
> +Dave Chinner's description of this is very clear:
> +
> + "The fundamental issue is that ->page_mkwrite must be called on every
> + write access to a clean file backed page, not just the first one.
> + How long the GUP reference lasts is irrelevant, if the page is clean
> + and you need to dirty it, you must call ->page_mkwrite before it is
> + marked writeable and dirtied. Every. Time."
> +
> +This is just one symptom of the larger design problem: filesystems do not
> +actually support get_user_pages() being called on their pages, and letting
> +hardware write directly to those pages--even though that pattern has been
> +going on since about 2005 or so.
> +
> +Long term GUP
> +=============
> +
> +Long term GUP is an issue when FOLL_WRITE is specified to GUP (so, a
> +writeable mapping is created), and the pages are file-backed. That can lead
> +to filesystem corruption. What happens is that when a file-backed page is
> +being written back, it is first mapped read-only in all of the CPU page
> +tables; the file system then assumes that nobody can write to the page, and
> +that the page content is therefore stable. Unfortunately, the GUP callers
> +generally do not monitor changes to the CPU page tables; they instead
> +assume that the following pattern is safe (it's not):
> +
> +::
> +
> + get_user_pages()
> +
> + Hardware then keeps a reference to those pages for some potentially
> + long time. During this time, hardware may write to the pages. Because
> + "hardware" here means "devices that are not a CPU", this activity
> + occurs without any interaction with the kernel's file system code.
> +
> + for each page:
> + set_page_dirty()
> + put_page()
> +
> +In fact, the GUP documentation even recommends that pattern.
> +
> +Anyway, the file system assumes that the page is stable (nothing is writing
> +to the page), and that is a problem: stable page content is necessary for
> +many filesystem actions during writeback, such as checksum, encryption,
> +RAID striping, etc. Furthermore, filesystem features like COW (copy on
> +write) or snapshot also rely on being able to use a new page as memory
> +for that memory range inside the file.
> +
> +Corruption during write back is clearly possible here. To solve that, one
> +idea is to identify pages that have active GUP, so that we can use a bounce
> +page to write stable data to the filesystem. The filesystem would work
> +on the bounce page, while any of the active GUP users might write to the
> +original page. This would avoid the stable page violation problem, but note
> +that it is only part of the overall solution, because other problems
> +remain.
> +
> +Other filesystem features that need to replace the page with a new one can
> +be inhibited for pages that are GUP-pinned. This will, however, alter and
> +limit some of those filesystem features. The only fix for that would be to
> +require GUP users monitor and respond to CPU page table updates. Subsystems
> +such as ODP and HMM do this, for example. This aspect of the problem is
> +still under discussion.
> +
> +Direct IO
> +=========
> +
> +Direct IO can cause corruption, if userspace does Direct-IO that writes to
> +a range of virtual addresses that are mmap'd to a file. The pages written
> +to are file-backed pages that can be under write back, while the Direct IO
> +is taking place. Here, Direct IO races with write back: it calls
> +GUP before page_mkclean() has replaced the CPU pte with a read-only entry.
> +The race window is pretty small, which is probably why years have gone by
> +before we noticed this problem: Direct IO is generally very quick, and
> +tends to finish up before the filesystem gets around to do anything with
> +the page contents. However, it's still a real problem. The solution is
> +to never let GUP return pages that are under write back, but instead,
> +force GUP to take a write fault on those pages. That way, GUP will
> +properly synchronize with the active write back. This does not change the
> +required GUP behavior, it just avoids that race.
> +
> +Measurement and visibility
> +==========================
> +
> +There are several /proc/vmstat items, in order to provide some visibility
> +into what get_user_pages() and put_user_page() are doing.
> +
> +After booting and running fio (https://github.com/axboe/fio)
> +a few times on an NVMe device, as a way to get lots of
> +get_user_pages_fast() calls, the counters look like this:
> +
> +::
> +
> + $ cat /proc/vmstat | grep gup
> + nr_gup_slow_pages_requested 21319
> + nr_gup_fast_pages_requested 11533792
> + nr_gup_fast_page_backoffs 0
> + nr_gup_page_count_overflows 0
> + nr_gup_pages_returned 11555104
> +
> +Interpretation of the above:
> +
> +::
> +
> + Total gup requests (slow + fast): 11555111
> + Total put_user_page calls: 11555104
> +
> +This shows 7 more calls to get_user_pages() than to put_user_page().
> +That may, or may not, represent a problem worth investigating.
> +
> +Normally, those last two numbers should be equal, but a couple of things
> +may cause them to differ:
> +
> +1. Inherent race condition in reading /proc/vmstat values.
> +
> +2. Bugs at any of the get_user_pages*() call sites. Those
> +sites need to match get_user_pages() and put_user_page() calls.
> +
> +
> +
> diff --git a/Documentation/vm/index.rst b/Documentation/vm/index.rst
> index 2b3ab3a1ccf3..433aaf1996e6 100644
> --- a/Documentation/vm/index.rst
> +++ b/Documentation/vm/index.rst
> @@ -32,6 +32,7 @@ descriptions of data structures and algorithms.
> balance
> cleancache
> frontswap
> + get_user_pages
> highmem
> hmm
> hwpoison
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8c68de3cfd80..1e8f91b8ce4f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6384,6 +6384,16 @@ M: Frank Haverkamp <[email protected]>
> S: Supported
> F: drivers/misc/genwqe/
>
> +GET_USER_PAGES
> +M: Dan Williams <[email protected]>
> +M: Jan Kara <[email protected]>
> +M: Jérôme Glisse <[email protected]>
> +M: John Hubbard <[email protected]>
> +L: [email protected]
> +S: Maintained
> +F: mm/gup.c
> +F: Documentation/vm/get_user_pages.rst
> +
> GET_MAINTAINER SCRIPT
> M: Joe Perches <[email protected]>
> S: Maintained
> --
> 2.20.1
>

--
Sincerely yours,
Mike.
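
The long-term pin pattern that the documentation above warns about can
be sketched in C roughly as follows (an editor's illustration; the
function names are hypothetical, and the calls are the 5.0-era APIs used
elsewhere in this series):

    #include <linux/mm.h>
    #include <linux/sched.h>

    /* Pin a user buffer for long-lived DMA: the problematic pattern. */
    static long pin_user_buffer(unsigned long start, unsigned long nr_pages,
                                struct page **pages)
    {
            long got;

            down_read(&current->mm->mmap_sem);
            got = get_user_pages(start, nr_pages, FOLL_WRITE, pages, NULL);
            up_read(&current->mm->mmap_sem);
            return got;
    }

    /*
     * Release the buffer, possibly hours later. set_page_dirty_lock() can
     * land on a clean page that writeback currently assumes is stable,
     * which is the violation described above; this series converts the
     * put_page() below to put_user_page().
     */
    static void release_user_buffer(struct page **pages, long nr_pages)
    {
            long i;

            for (i = 0; i < nr_pages; i++) {
                    set_page_dirty_lock(pages[i]);
                    put_page(pages[i]);
            }
    }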


Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On Mon, 4 Feb 2019, Ira Weiny wrote:

> On Mon, Feb 04, 2019 at 05:14:19PM +0000, Christopher Lameter wrote:
> > Frankly I still think this does not solve anything.
> >
> > Concurrent write access from two sources to a single page is simply wrong.
> > You cannot make this right by allowing long-term RDMA pins in a filesystem
> > and thus preventing the filesystem from ever updating part of its files on disk.
> >
> > Can we just disable RDMA to regular filesystems? Regular filesystems
> > should have full control of the write back and dirty status of their
> > pages.
>
> That may be a solution to the corruption/crashes but it is not a solution which
> users want to see. RDMA directly to file systems (specifically DAX) is a use
> case we have seen customers ask for.

DAX is a special file system that does not use writeback for the DAX
mappings. Thus it could be an exception. And the pages are already pinned.




2019-02-05 22:04:12

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH 6/6] mm/gup: Documentation/vm/get_user_pages.rst, MAINTAINERS

On 2/5/19 8:40 AM, Mike Rapoport wrote:
> Hi John,
>
> On Sun, Feb 03, 2019 at 09:21:35PM -0800, [email protected] wrote:
>> From: John Hubbard <[email protected]>
>>
>> 1. Added Documentation/vm/get_user_pages.rst
>>
>> 2. Added a GET_USER_PAGES entry in MAINTAINERS
>>
>> Cc: Dan Williams <[email protected]>
>> Cc: Jan Kara <[email protected]>
>> Signed-off-by: Jérôme Glisse <[email protected]>
>> Signed-off-by: John Hubbard <[email protected]>
>> ---
>> Documentation/vm/get_user_pages.rst | 197 ++++++++++++++++++++++++++++
>> Documentation/vm/index.rst | 1 +
>> MAINTAINERS | 10 ++
>> 3 files changed, 208 insertions(+)
>> create mode 100644 Documentation/vm/get_user_pages.rst
>>
>> diff --git a/Documentation/vm/get_user_pages.rst b/Documentation/vm/get_user_pages.rst
>> new file mode 100644
>> index 000000000000..8598f20afb09
>> --- /dev/null
>> +++ b/Documentation/vm/get_user_pages.rst
>
> It's great to see docs coming along with the patches! :)
>
> Yet, I'm a bit confused. The documentation here mostly describes the
> existing problems that this patchset aims to solve, but the text here does
> not describe the proposed solution.
>

Yes, that's true. I'll take another pass at it with that in mind.

thanks,
--
John Hubbard
NVIDIA

2019-02-05 22:08:20

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH 0/6] RFC v2: mm: gup/dma tracking

On 2/5/19 5:38 AM, Tom Talpey wrote:
>
> Ok, I'm satisfied the four-9's latency spike is not in your code. :-)
> Results look good relative to baseline. Thanks for doublechecking!
>
> Tom.


Great, in that case, I'll put the new before-and-after results in the next
version. Appreciate your help here, as always!

thanks,
--
John Hubbard
NVIDIA

2019-02-11 09:53:27

by Chen, Rong A

[permalink] [raw]
Subject: [LKP] [mm/gup] cdaa813278: kvm-unit-tests.vmx_ept_access_test_paddr_read_write.fail

FYI, we noticed the following commit (built with gcc-7):

commit: cdaa813278ddc616ee201eacda77f63996b5dd2d ("[PATCH 4/6] mm/gup: track gup-pinned pages")
url: https://github.com/0day-ci/linux/commits/john-hubbard-gmail-com/RFC-v2-mm-gup-dma-tracking/20190205-001101


in testcase: kvm-unit-tests
with following parameters:




on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):




ignored_by_lkp: pku
ignored_by_lkp: taskswitch
ignored_by_lkp: taskswitch2
ignored_by_lkp: svm
ignored_by_lkp: hyperv_connections
ignored_by_lkp: vmx
ignored_by_lkp: vmx_controls
ignored_by_lkp: vmx_test_vmx_feature_control
ignored_by_lkp: vmx_instruction_intercept
ignored_by_lkp: vmware_backdoors
2019-02-08 13:10:41 ./run_tests.sh
FAIL apic-split (50 tests, 1 unexpected failures)
PASS ioapic-split (19 tests)
FAIL apic (50 tests, 1 unexpected failures)
PASS ioapic (19 tests)
PASS smptest (1 tests)
PASS smptest3 (1 tests)
PASS vmexit_cpuid
PASS vmexit_vmcall
PASS vmexit_mov_from_cr8
PASS vmexit_mov_to_cr8
PASS vmexit_inl_pmtimer
PASS vmexit_ipi
PASS vmexit_ipi_halt
PASS vmexit_ple_round_robin
PASS vmexit_tscdeadline
PASS vmexit_tscdeadline_immed
PASS access
PASS smap (18 tests)
PASS emulator (125 tests, 2 skipped)
PASS eventinj (13 tests)
PASS hypercall (2 tests)
PASS idt_test (4 tests)
PASS memory (8 tests)
PASS msr (12 tests)
PASS pmu (67 tests)
PASS port80
PASS realmode
PASS s3
PASS sieve
PASS syscall (2 tests)
PASS tsc (3 tests)
PASS tsc_adjust (5 tests)
PASS xsave (17 tests)
PASS rmap_chain
PASS kvmclock_test
PASS pcid (3 tests)
PASS umip (21 tests)
PASS vmx_null (1 tests)
PASS vmx_test_vmxon (4 tests)
PASS vmx_test_vmptrld (5 tests)
PASS vmx_test_vmclear (6 tests)
PASS vmx_test_vmptrst (1 tests)
PASS vmx_test_vmwrite_vmread (1 tests)
PASS vmx_test_vmcs_high (4 tests)
PASS vmx_test_vmcs_lifecycle (9 tests)
PASS vmx_test_vmx_caps (11 tests)
PASS vmx_vmenter (2 tests)
PASS vmx_preemption_timer (5 tests)
PASS vmx_control_field_PAT (3 tests)
PASS vmx_control_field_EFER (3 tests)
PASS vmx_CR_shadowing (12 tests)
PASS vmx_IO_bitmap (15 tests)
PASS vmx_EPT_AD_enabled (17 tests)
PASS vmx_EPT_AD_disabled (19 tests)
PASS vmx_PML (2 tests)
PASS vmx_VPID (3 tests)
PASS vmx_interrupt (8 tests)
PASS vmx_debug_controls (4 tests)
PASS vmx_MSR_switch (4 tests)
PASS vmx_vmmcall (1 tests)
PASS vmx_disable_RDTSCP (2 tests)
PASS vmx_int3 (1 tests)
PASS vmx_into (1 tests)
PASS vmx_exit_monitor_from_l2_test
PASS vmx_v2 (29 tests)
PASS vmx_ept_access_test_not_present (188 tests)
PASS vmx_ept_access_test_read_only (158 tests)
PASS vmx_ept_access_test_write_only (152 tests)
PASS vmx_ept_access_test_read_write (128 tests)
PASS vmx_ept_access_test_execute_only (158 tests)
PASS vmx_ept_access_test_read_execute (128 tests)
PASS vmx_ept_access_test_write_execute (152 tests)
PASS vmx_ept_access_test_read_write_execute (98 tests)
PASS vmx_ept_access_test_reserved_bits (4078 tests)
PASS vmx_ept_access_test_ignored_bits (2888 tests)
PASS vmx_ept_access_test_paddr_not_present_ad_disabled (62 tests)
PASS vmx_ept_access_test_paddr_not_present_ad_enabled (62 tests)
PASS vmx_ept_access_test_paddr_read_only_ad_disabled (141 tests)
PASS vmx_ept_access_test_paddr_read_only_ad_enabled (166 tests)
FAIL vmx_ept_access_test_paddr_read_write (timeout; duration=90s)
FAIL vmx_ept_access_test_paddr_read_write_execute (timeout; duration=90s)
PASS vmx_ept_access_test_paddr_read_execute_ad_disabled (141 tests)
PASS vmx_ept_access_test_paddr_read_execute_ad_enabled (166 tests)
PASS vmx_ept_access_test_paddr_not_present_page_fault (8 tests)
PASS vmx_ept_access_test_force_2m_page (48 tests)
PASS vmx_invvpid (1561 tests)
PASS vmx_vmentry_movss_shadow_test (11 tests)
PASS vmx_cr_load_test (8 tests)
PASS vmx_nm_test (9 tests)
PASS vmx_pending_event_test (12 tests)
PASS vmx_pending_event_hlt_test (12 tests)
PASS vmx_db_test (53 tests, 5 expected failures)
PASS vmx_eoi_bitmap_ioapic_scan (7 tests)
PASS vmx_hlt_with_rvi_test (7 tests)
PASS vmx_apic_passthrough (12 tests)
PASS vmx_apic_passthrough_thread (8 tests)
PASS vmx_vmcs_shadow_test (142218 tests)
PASS debug (11 tests)
PASS hyperv_synic (1 tests)
PASS hyperv_stimer (12 tests)
PASS hyperv_clock
PASS intel_iommu (11 tests)



To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml



Thanks,
Rong Chen


Attachments:
(No filename) (5.46 kB)
config-5.0.0-rc4-00004-gcdaa8132 (171.78 kB)
job-script (4.89 kB)
dmesg.xz (22.74 kB)
kvm-unit-tests (4.73 kB)
job.yaml (3.98 kB)
reproduce (19.00 B)

2019-02-18 02:33:49

by Chen, Rong A

[permalink] [raw]
Subject: [LKP] [mm/gup] e7ae097b0b: will-it-scale.per_process_ops -5.0% regression

Greeting,

FYI, we noticed a -5.0% regression of will-it-scale.per_process_ops due to commit:


commit: e7ae097b0bda3e7dfd224e2a960346c37aa42394 ("[PATCH 5/6] mm/gup: /proc/vmstat support for get/put user pages")
url: https://github.com/0day-ci/linux/commits/john-hubbard-gmail-com/RFC-v2-mm-gup-dma-tracking/20190205-001101


in testcase: will-it-scale
on test machine: 192 threads Skylake-4S with 704G memory
with following parameters:

nr_task: 100%
mode: process
test: futex1
cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a threads-based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale



Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.2/process/100%/debian-x86_64-2018-04-03.cgz/lkp-skl-4sp1/futex1/will-it-scale

commit:
cdaa813278 ("mm/gup: track gup-pinned pages")
e7ae097b0b ("mm/gup: /proc/vmstat support for get/put user pages")

cdaa813278ddc616 e7ae097b0bda3e7dfd224e2a96
---------------- --------------------------
%stddev %change %stddev
\ | \
1808394 -5.0% 1717524 will-it-scale.per_process_ops
3.472e+08 -5.0% 3.298e+08 will-it-scale.workload
7936 +3.0% 8174 vmstat.system.cs
546083 ± 30% -41.7% 318102 ± 68% numa-numastat.node2.local_node
576076 ± 27% -40.3% 343781 ± 60% numa-numastat.node2.numa_hit
27067 -4.5% 25852 ± 2% proc-vmstat.nr_shmem
30283 -6.3% 28378 ± 3% proc-vmstat.pgactivate
18458 ± 5% +9.0% 20120 ± 5% slabinfo.kmalloc-96.active_objs
19055 ± 4% +7.7% 20521 ± 5% slabinfo.kmalloc-96.num_objs
1654 ± 5% +13.5% 1877 ± 3% slabinfo.pool_workqueue.active_objs
1747 ± 5% +10.9% 1937 ± 3% slabinfo.pool_workqueue.num_objs
17.13 ± 3% +2.3 19.45 ± 2% perf-profile.calltrace.cycles-pp.gup_pgd_range.get_user_pages_fast.get_futex_key.futex_wake.do_futex
0.00 +2.4 2.40 ± 9% perf-profile.calltrace.cycles-pp.mod_node_page_state.gup_pgd_range.get_user_pages_fast.get_futex_key.futex_wake
19.62 ± 3% +2.4 22.06 ± 2% perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wake.do_futex.__x64_sys_futex
26.50 ± 3% +2.8 29.33 ± 2% perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
32.48 ± 3% +3.0 35.52 ± 2% perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
46.45 ±100% -67.5% 15.08 ± 22% sched_debug.cfs_rq:/.load_avg.stddev
0.04 ± 5% -11.4% 0.04 ± 6% sched_debug.cfs_rq:/.nr_running.stddev
13.08 ± 4% +18.8% 15.54 ± 5% sched_debug.cfs_rq:/.runnable_load_avg.max
1.14 ± 4% +18.6% 1.35 ± 5% sched_debug.cfs_rq:/.runnable_load_avg.stddev
13543 ± 7% +9.5% 14825 ± 4% sched_debug.cfs_rq:/.runnable_weight.max
1267 ± 5% +13.4% 1436 ± 6% sched_debug.cfs_rq:/.runnable_weight.stddev
1089 ± 3% +14.4% 1246 ± 7% sched_debug.cfs_rq:/.util_avg.max
533.25 ± 7% +7.7% 574.21 ± 5% sched_debug.cfs_rq:/.util_est_enqueued.max
1.17 ± 6% +18.1% 1.39 ± 6% sched_debug.cpu.cpu_load[0].stddev
3321 +9.1% 3622 ± 4% sched_debug.cpu.curr->pid.min
1355 ± 5% +10.1% 1493 ± 6% sched_debug.cpu.load.stddev
1399 ± 6% -11.5% 1239 ± 3% sched_debug.cpu.nr_load_updates.stddev
270.33 ± 8% +52.7% 412.92 ± 14% sched_debug.cpu.nr_switches.min
113.58 ± 3% +75.1% 198.88 ± 6% sched_debug.cpu.sched_count.min
6331 ± 17% +41.5% 8956 ± 9% sched_debug.cpu.sched_goidle.max
545.75 ± 5% +24.9% 681.80 ± 9% sched_debug.cpu.sched_goidle.stddev
89.38 +48.7% 132.92 ± 3% sched_debug.cpu.ttwu_count.min
48.21 ± 2% +82.6% 88.04 ± 6% sched_debug.cpu.ttwu_local.min
47917 ± 29% -71.3% 13757 ± 83% numa-vmstat.node0.nr_active_anon
30393 ± 45% -71.0% 8807 ±111% numa-vmstat.node0.nr_anon_pages
78776 ± 10% -24.6% 59392 ± 7% numa-vmstat.node0.nr_file_pages
9592 ± 8% -10.3% 8602 ± 7% numa-vmstat.node0.nr_kernel_stack
2706 ± 37% -32.6% 1824 ± 38% numa-vmstat.node0.nr_mapped
18243 ± 45% -90.1% 1803 ±118% numa-vmstat.node0.nr_shmem
9147 ± 19% -42.9% 5227 ± 42% numa-vmstat.node0.nr_slab_reclaimable
18959 ± 7% -17.8% 15594 ± 14% numa-vmstat.node0.nr_slab_unreclaimable
47917 ± 29% -71.3% 13758 ± 83% numa-vmstat.node0.nr_zone_active_anon
34497 ± 41% +54.6% 53318 ± 19% numa-vmstat.node1.nr_active_anon
60479 ± 6% +9.6% 66283 ± 9% numa-vmstat.node1.nr_file_pages
1419 ±122% +162.7% 3729 ± 45% numa-vmstat.node1.nr_inactive_anon
1571 ±112% +441.7% 8510 ± 73% numa-vmstat.node1.nr_shmem
34497 ± 41% +54.6% 53318 ± 19% numa-vmstat.node1.nr_zone_active_anon
1419 ±122% +162.7% 3729 ± 45% numa-vmstat.node1.nr_zone_inactive_anon
9145 ±104% +188.9% 26423 ± 48% numa-vmstat.node3.nr_active_anon
1476 ± 52% +432.0% 7853 ±101% numa-vmstat.node3.nr_anon_pages
3918 ± 20% +78.8% 7005 ± 20% numa-vmstat.node3.nr_slab_reclaimable
12499 ± 7% +22.7% 15336 ± 12% numa-vmstat.node3.nr_slab_unreclaimable
9145 ±104% +188.9% 26423 ± 48% numa-vmstat.node3.nr_zone_active_anon
194801 ± 28% -71.7% 55131 ± 83% numa-meminfo.node0.Active
191699 ± 28% -71.2% 55131 ± 83% numa-meminfo.node0.Active(anon)
38555 ± 84% -100.0% 0.00 numa-meminfo.node0.AnonHugePages
121677 ± 45% -70.9% 35356 ±111% numa-meminfo.node0.AnonPages
315090 ± 10% -24.6% 237568 ± 7% numa-meminfo.node0.FilePages
36584 ± 19% -42.8% 20912 ± 42% numa-meminfo.node0.KReclaimable
9594 ± 8% -10.3% 8603 ± 7% numa-meminfo.node0.KernelStack
10625 ± 34% -32.6% 7165 ± 36% numa-meminfo.node0.Mapped
892127 ± 11% -25.0% 668837 ± 13% numa-meminfo.node0.MemUsed
36584 ± 19% -42.8% 20912 ± 42% numa-meminfo.node0.SReclaimable
75840 ± 7% -17.8% 62378 ± 14% numa-meminfo.node0.SUnreclaim
72956 ± 45% -90.1% 7212 ±118% numa-meminfo.node0.Shmem
112426 ± 11% -25.9% 83290 ± 21% numa-meminfo.node0.Slab
138083 ± 41% +54.5% 213397 ± 19% numa-meminfo.node1.Active(anon)
241919 ± 6% +9.6% 265074 ± 9% numa-meminfo.node1.FilePages
5679 ±122% +162.6% 14916 ± 45% numa-meminfo.node1.Inactive(anon)
6284 ±112% +440.7% 33980 ± 73% numa-meminfo.node1.Shmem
36547 ±104% +189.0% 105608 ± 48% numa-meminfo.node3.Active
36547 ±104% +189.0% 105608 ± 48% numa-meminfo.node3.Active(anon)
5876 ± 52% +433.8% 31369 ±101% numa-meminfo.node3.AnonPages
15671 ± 20% +78.8% 28021 ± 20% numa-meminfo.node3.KReclaimable
592511 ± 8% +15.3% 683394 ± 8% numa-meminfo.node3.MemUsed
15671 ± 20% +78.8% 28021 ± 20% numa-meminfo.node3.SReclaimable
49997 ± 7% +22.7% 61344 ± 12% numa-meminfo.node3.SUnreclaim
65670 ± 6% +36.1% 89366 ± 14% numa-meminfo.node3.Slab



will-it-scale.per_process_ops

1.84e+06 +-+--------------------------------------------------------------+
| |
1.82e+06 +-+.. .+.+.. .+..+.+.+..+.+.+.. .+.+.+.. .+.. .+..+. .|
1.8e+06 +-+ + +.+ + +.+ +.+ +.+..+ |
| |
1.78e+06 +-+ |
| |
1.76e+06 +-+ |
| |
1.74e+06 +-+ |
1.72e+06 +-+ O O O O O O O O O |
| O O O O O O O O O |
1.7e+06 +-+ O O |
O O O O O |
1.68e+06 +-+--------------------------------------------------------------+


will-it-scale.workload

3.5e+08 +-+--------------------------------------------------------------+
|.+..+.+ + + +.+ +.+ +..+.+.+..+.+.+..+.+. .+.|
3.45e+08 +-+ +.+ +. |
| |
| |
3.4e+08 +-+ |
| |
3.35e+08 +-+ |
| |
3.3e+08 +-+ O O O O O O O O O O O O O O |
| O O O O |
| O O |
3.25e+08 O-O O O O |
| |
3.2e+08 +-+--------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


Attachments:
(No filename) (11.83 kB)
config-5.0.0-rc4-00005-ge7ae097 (171.84 kB)
job-script (7.39 kB)
job.yaml (4.85 kB)
reproduce (321.00 B)

2019-02-20 19:24:44

by Ira Weiny

[permalink] [raw]
Subject: Re: [PATCH 4/6] mm/gup: track gup-pinned pages

On Sun, Feb 03, 2019 at 09:21:33PM -0800, [email protected] wrote:
> From: John Hubbard <[email protected]>
>

[snip]

>
> +/*
> + * GUP_PIN_COUNTING_BIAS, and the associated functions that use it, overload
> + * the page's refcount so that two separate items are tracked: the original page
> + * reference count, and also a new count of how many get_user_pages() calls were
> + * made against the page. ("gup-pinned" is another term for the latter).
> + *
> + * With this scheme, get_user_pages() becomes special: such pages are marked
> + * as distinct from normal pages. As such, the new put_user_page() call (and
> + * its variants) must be used in order to release gup-pinned pages.
> + *
> + * Choice of value:
> + *
> + * By making GUP_PIN_COUNTING_BIAS a power of two, debugging of page reference
> + * counts with respect to get_user_pages() and put_user_page() becomes simpler,
> + * due to the fact that adding an even power of two to the page refcount has
> + * the effect of using only the upper N bits, for the code that counts up using
> + * the bias value. This means that the lower bits are left for the exclusive
> + * use of the original code that increments and decrements by one (or at least,
> + * by much smaller values than the bias value).
> + *
> + * Of course, once the lower bits overflow into the upper bits (and this is
> + * OK, because subtraction recovers the original values), then visual inspection
> + * no longer suffices to directly view the separate counts. However, for normal
> + * applications that don't have huge page reference counts, this won't be an
> + * issue.
> + *
> + * This has to work on 32-bit as well as 64-bit systems. On the more
> + * constrained 32-bit systems, the 10-bit bias leaves 22 upper bits, so
> + * only about 4M (2^22) get_user_pages() pins may be outstanding against
> + * a page at once.
> + *
> + * Locking: the lockless algorithm described in page_cache_gup_pin_speculative()
> + * and page_cache_gup_pin_speculative() provides safe operation for

Did you mean:

page_cache_gup_pin_speculative() and _page_cache_get_speculative()_?

Just found this while looking at your branch.

Sorry,
Ira

> + * get_user_pages and page_mkclean and other calls that race to set up page
> + * table entries.
> + */
> +#define GUP_PIN_COUNTING_BIAS (1UL << 10)
> +
> +int get_gup_pin_page(struct page *page);
> +
> +void put_user_page(struct page *page);
> +void put_user_pages_dirty(struct page **pages, unsigned long npages);
> +void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
> +void put_user_pages(struct page **pages, unsigned long npages);
> +
> +/**
> + * page_gup_pinned() - report if a page is gup-pinned (pinned by a call to
> + * get_user_pages).
> + * @page: pointer to page to be queried.
> + * @Returns: True, if it is likely that the page has been "gup-pinned".
> + * False, if the page is definitely not gup-pinned.
> + */
> +static inline bool page_gup_pinned(struct page *page)
> +{
> + return (page_ref_count(page)) > GUP_PIN_COUNTING_BIAS;
> +}
> +
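
(For reference, the counter split that the comment above describes can be
checked with a short userspace sketch. The code below is illustrative only,
not part of the patch; with a 10-bit bias on a 32-bit refcount, the upper
22 bits carry the pin count, which is where the ~4M (2^22) limit comes from.)

    #include <assert.h>
    #include <stdio.h>

    #define GUP_PIN_COUNTING_BIAS (1UL << 10)

    int main(void)
    {
            unsigned long refcount = 1;     /* original page reference */

            refcount += 3;                  /* three ordinary get_page() calls */
            refcount += 2 * GUP_PIN_COUNTING_BIAS;  /* two gup pins */

            /*
             * As long as neither counter overflows into the other's bit
             * range, both values can be read back directly.
             */
            printf("pins: %lu, normal refs: %lu\n",
                   refcount / GUP_PIN_COUNTING_BIAS,
                   refcount % GUP_PIN_COUNTING_BIAS);

            /* The page_gup_pinned() test from the patch. */
            assert(refcount > GUP_PIN_COUNTING_BIAS);

            /* put_user_page()-style release: subtraction recovers the rest. */
            refcount -= 2 * GUP_PIN_COUNTING_BIAS;
            assert(refcount == 4);
            return 0;
    }
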
> static inline void get_page(struct page *page)
> {
> page = compound_head(page);
> @@ -993,30 +1050,6 @@ static inline void put_page(struct page *page)
> __put_page(page);
> }
>
> -/**
> - * put_user_page() - release a gup-pinned page
> - * @page: pointer to page to be released
> - *
> - * Pages that were pinned via get_user_pages*() must be released via
> - * either put_user_page(), or one of the put_user_pages*() routines
> - * below. This is so that eventually, pages that are pinned via
> - * get_user_pages*() can be separately tracked and uniquely handled. In
> - * particular, interactions with RDMA and filesystems need special
> - * handling.
> - *
> - * put_user_page() and put_page() are not interchangeable, despite this early
> - * implementation that makes them look the same. put_user_page() calls must
> - * be perfectly matched up with get_user_page() calls.
> - */
> -static inline void put_user_page(struct page *page)
> -{
> - put_page(page);
> -}
> -
> -void put_user_pages_dirty(struct page **pages, unsigned long npages);
> -void put_user_pages_dirty_lock(struct page **pages, unsigned long npages);
> -void put_user_pages(struct page **pages, unsigned long npages);
> -
> #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> #define SECTION_IN_PAGE_FLAGS
> #endif
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 5c8a9b59cbdc..5f5b72ba595f 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -209,6 +209,11 @@ static inline int page_cache_add_speculative(struct page *page, int count)
> return __page_cache_add_speculative(page, count);
> }
>
> +static inline int page_cache_gup_pin_speculative(struct page *page)
> +{
> + return __page_cache_add_speculative(page, GUP_PIN_COUNTING_BIAS);
> +}
> +
> #ifdef CONFIG_NUMA
> extern struct page *__page_cache_alloc(gfp_t gfp);
> #else
> diff --git a/mm/gup.c b/mm/gup.c
> index 05acd7e2eb22..3291da342f9c 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -25,6 +25,26 @@ struct follow_page_context {
> unsigned int page_mask;
> };
>
> +/**
> + * get_gup_pin_page() - mark a page as being used by get_user_pages().
> + * @page: pointer to page to be marked
> + * @Returns: 0 for success, -EOVERFLOW if the page refcount would have
> + * overflowed.
> + *
> + */
> +int get_gup_pin_page(struct page *page)
> +{
> + page = compound_head(page);
> +
> + if (page_ref_count(page) >= (UINT_MAX - GUP_PIN_COUNTING_BIAS)) {
> + WARN_ONCE(1, "get_user_pages pin count overflowed");
> + return -EOVERFLOW;
> + }
> +
> + page_ref_add(page, GUP_PIN_COUNTING_BIAS);
> + return 0;
> +}
> +
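
(The guard in get_gup_pin_page() above refuses a pin that could wrap the
refcount. A minimal userspace model of that check follows; the toy names
are illustrative, not kernel code.)

    #include <errno.h>
    #include <limits.h>

    #define GUP_PIN_COUNTING_BIAS (1UL << 10)

    /*
     * Toy model of get_gup_pin_page()'s guard: fail rather than let the
     * bias addition wrap an unsigned 32-bit refcount.
     */
    static int toy_pin(unsigned int *refcount)
    {
            if (*refcount >= UINT_MAX - GUP_PIN_COUNTING_BIAS)
                    return -EOVERFLOW;      /* the kernel also WARNs once here */

            *refcount += GUP_PIN_COUNTING_BIAS;
            return 0;
    }

    int main(void)
    {
            unsigned int ref = UINT_MAX - 10;       /* nearly saturated counter */

            return toy_pin(&ref) == -EOVERFLOW ? 0 : 1;
    }
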
> static struct page *no_page_table(struct vm_area_struct *vma,
> unsigned int flags)
> {
> @@ -157,8 +177,14 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
> goto retry;
> }
>
> - if (flags & FOLL_GET)
> - get_page(page);
> + if (flags & FOLL_GET) {
> + int ret = get_gup_pin_page(page);
> +
> + if (ret) {
> + page = ERR_PTR(ret);
> + goto out;
> + }
> + }
> if (flags & FOLL_TOUCH) {
> if ((flags & FOLL_WRITE) &&
> !pte_dirty(pte) && !PageDirty(page))
> @@ -497,7 +523,10 @@ static int get_gate_page(struct mm_struct *mm, unsigned long address,
> if (is_device_public_page(*page))
> goto unmap;
> }
> - get_page(*page);
> +
> + ret = get_gup_pin_page(*page);
> + if (ret)
> + goto unmap;
> out:
> ret = 0;
> unmap:
> @@ -1429,11 +1458,11 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
> page = pte_page(pte);
> head = compound_head(page);
>
> - if (!page_cache_get_speculative(head))
> + if (!page_cache_gup_pin_speculative(head))
> goto pte_unmap;
>
> if (unlikely(pte_val(pte) != pte_val(*ptep))) {
> - put_page(head);
> + put_user_page(head);
> goto pte_unmap;
> }
>
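
(This hunk is the lockless fast-path pattern that the "Locking:" comment
refers to: speculatively take the biased reference, re-read the PTE, and
undo the pin if the PTE changed underneath. Below is a compressed sketch of
that control flow, using hypothetical toy types and C11 atomics in place of
the kernel's page_ref helpers. Note that the undo must subtract the bias,
which is exactly why the hunk replaces put_page() with put_user_page().)

    #include <stdatomic.h>
    #include <stdbool.h>

    #define GUP_PIN_COUNTING_BIAS (1UL << 10)

    struct toy_page { atomic_ulong refcount; };

    /* Speculative pin: only succeeds while the page is still live. */
    static bool toy_pin_speculative(struct toy_page *page)
    {
            unsigned long ref = atomic_load(&page->refcount);

            do {
                    if (ref == 0)           /* already freed: do not revive */
                            return false;
            } while (!atomic_compare_exchange_weak(&page->refcount, &ref,
                                    ref + GUP_PIN_COUNTING_BIAS));
            return true;
    }

    static void toy_unpin(struct toy_page *page)
    {
            atomic_fetch_sub(&page->refcount, GUP_PIN_COUNTING_BIAS);
    }

    /* The gup_pte_range() pattern: pin first, then recheck the PTE. */
    static bool toy_gup_one(struct toy_page *page, unsigned long pte_snapshot,
                            const unsigned long *ptep)
    {
            if (!toy_pin_speculative(page))
                    return false;
            if (*ptep != pte_snapshot) {    /* PTE changed: undo with the bias */
                    toy_unpin(page);
                    return false;
            }
            return true;
    }

    int main(void)
    {
            struct toy_page page = { .refcount = 1 };
            unsigned long pte = 0x1000;

            return toy_gup_one(&page, 0x1000, &pte) ? 0 : 1;
    }
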
> @@ -1488,7 +1517,11 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr,
> }
> SetPageReferenced(page);
> pages[*nr] = page;
> - get_page(page);
> + if (get_gup_pin_page(page)) {
> + undo_dev_pagemap(nr, nr_start, pages);
> + return 0;
> + }
> +
> (*nr)++;
> pfn++;
> } while (addr += PAGE_SIZE, addr != end);
> @@ -1569,15 +1602,14 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> } while (addr += PAGE_SIZE, addr != end);
>
> head = compound_head(pmd_page(orig));
> - if (!page_cache_add_speculative(head, refs)) {
> + if (!page_cache_gup_pin_speculative(head)) {
> *nr -= refs;
> return 0;
> }
>
> if (unlikely(pmd_val(orig) != pmd_val(*pmdp))) {
> *nr -= refs;
> - while (refs--)
> - put_page(head);
> + put_user_page(head);
> return 0;
> }
>
> @@ -1607,15 +1639,14 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
> } while (addr += PAGE_SIZE, addr != end);
>
> head = compound_head(pud_page(orig));
> - if (!page_cache_add_speculative(head, refs)) {
> + if (!page_cache_gup_pin_speculative(head)) {
> *nr -= refs;
> return 0;
> }
>
> if (unlikely(pud_val(orig) != pud_val(*pudp))) {
> *nr -= refs;
> - while (refs--)
> - put_page(head);
> + put_user_page(head);
> return 0;
> }
>
> @@ -1644,15 +1675,14 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
> } while (addr += PAGE_SIZE, addr != end);
>
> head = compound_head(pgd_page(orig));
> - if (!page_cache_add_speculative(head, refs)) {
> + if (!page_cache_gup_pin_speculative(head)) {
> *nr -= refs;
> return 0;
> }
>
> if (unlikely(pgd_val(orig) != pgd_val(*pgdp))) {
> *nr -= refs;
> - while (refs--)
> - put_page(head);
> + put_user_page(head);
> return 0;
> }
>
> diff --git a/mm/swap.c b/mm/swap.c
> index 7c42ca45bb89..39b0ddd35933 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -133,6 +133,27 @@ void put_pages_list(struct list_head *pages)
> }
> EXPORT_SYMBOL(put_pages_list);
>
> +/**
> + * put_user_page() - release a gup-pinned page
> + * @page: pointer to page to be released
> + *
> + * Pages that were pinned via get_user_pages*() must be released via
> + * either put_user_page(), or one of the put_user_pages*() routines
> + * below. This is so that eventually, pages that are pinned via
> + * get_user_pages*() can be separately tracked and uniquely handled. In
> + * particular, interactions with RDMA and filesystems need special
> + * handling.
> + */
> +void put_user_page(struct page *page)
> +{
> + page = compound_head(page);
> +
> + VM_BUG_ON_PAGE(page_ref_count(page) < GUP_PIN_COUNTING_BIAS, page);
> +
> + page_ref_sub(page, GUP_PIN_COUNTING_BIAS);
> +}
> +EXPORT_SYMBOL(put_user_page);
> +
> typedef int (*set_dirty_func)(struct page *page);
>
> static void __put_user_pages_dirty(struct page **pages,
> --
> 2.20.1
>

2019-02-20 20:24:51

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH 4/6] mm/gup: track gup-pinned pages

On 2/20/19 11:24 AM, Ira Weiny wrote:
> On Sun, Feb 03, 2019 at 09:21:33PM -0800, [email protected] wrote:
>> From: John Hubbard <[email protected]>
> [snip]
>> + *
>> + * Locking: the lockless algorithm described in page_cache_gup_pin_speculative()
>> + * and page_cache_gup_pin_speculative() provides safe operation for
>
> Did you mean:
>
> page_cache_gup_pin_speculative and __ page_cache_get_speculative __?
>
> Just found this while looking at your branch.
>
> Sorry,
> Ira
>

Hi Ira,

Yes, thanks for catching that. I've changed it in the git repo now, and it will
show up when the next spin of this patchset goes out.

thanks,
--
John Hubbard
NVIDIA

2019-02-28 12:25:10

by Chen, Rong A

[permalink] [raw]
Subject: [LKP] [mm/gup] cdaa813278: stress-ng.numa.ops_per_sec 4671.0% improvement

Greetings,

FYI, we noticed a 4671.0% improvement of stress-ng.numa.ops_per_sec due to commit:


commit: cdaa813278ddc616ee201eacda77f63996b5dd2d ("[PATCH 4/6] mm/gup: track gup-pinned pages")
url: https://github.com/0day-ci/linux/commits/john-hubbard-gmail-com/RFC-v2-mm-gup-dma-tracking/20190205-001101


in testcase: stress-ng
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
with following parameters:

nr_threads: 50%
disk: 1HDD
testtime: 5s
class: memory
cpufreq_governor: performance
ucode: 0xb00002e


In addition, the commit has a significant impact on the following tests:

+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: ltp.cve-2017-18075.pass -100.0% undefined |
| test machine | qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=100% |
| | test=futex1 |
+------------------+---------------------------------------------------------------------------+
| testcase: change | stress-ng: |
| test machine | 192 threads Skylake-4S with 704G memory |
| test parameters | class=cpu |
| | cpufreq_governor=performance |
| | disk=1HDD |
| | nr_threads=100% |
| | testtime=1s |
+------------------+---------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.numa.ops_per_sec 401.7% improvement |
| test machine | 272 threads Intel(R) Xeon Phi(TM) CPU 7255 @ 1.10GHz with 112G memory |
| test parameters | class=pipe |
| | cpufreq_governor=performance |
| | disk=1HDD |
| | nr_threads=100% |
| | testtime=1s |
+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: stress-ng.vm-splice.ops_per_sec -18.5% regression |
| test machine | 272 threads Intel(R) Xeon Phi(TM) CPU 7255 @ 1.10GHz with 112G memory |
| test parameters | cpufreq_governor=performance |
| | mode=process |
| | nr_task=100% |
| | test=futex2 |
| | ucode=0xb00002e |
+------------------+---------------------------------------------------------------------------+
| testcase: change | stress-ng: |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters | class=os |
| | cpufreq_governor=performance |
| | disk=1HDD |
| | nr_threads=100% |
| | testtime=1s |
+------------------+---------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.futex.ops_per_sec -58.0% regression |
| test machine | 272 threads Intel(R) Xeon Phi(TM) CPU 7255 @ 1.10GHz with 112G memory |
| test parameters | class=vm |
| | cpufreq_governor=performance |
| | disk=1HDD |
| | nr_threads=100% |
| | testtime=1s |
| | ucode=0xb00002e |
+------------------+---------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.vm-splice.ops -58.3% undefined |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters | class=pipe |
| | cpufreq_governor=performance |
| | disk=1HDD |
| | nr_threads=100% |
| | testtime=60s |
| | ucode=0xb00002e |
+------------------+---------------------------------------------------------------------------+
| testcase: change | stress-ng: kernel_selftests.memfd.run_fuse_test.sh.pass -100.0% undefined |
| test machine | qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 4G |
| test parameters | class=pipe |
| | cpufreq_governor=performance |
| | disk=1HDD |
| | nr_threads=100% |
| | testtime=60s |
+------------------+---------------------------------------------------------------------------+
| testcase: change | kvm-unit-tests: stress-ng.vm-splice.ops -99.3% undefined |
| test machine | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
+------------------+---------------------------------------------------------------------------+


Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml

=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
memory/gcc-7/performance/1HDD/x86_64-rhel-7.2/50%/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3/stress-ng/5s/0xb00002e

commit:
9627026352 ("mm: page_cache_add_speculative(): refactoring")
cdaa813278 ("mm/gup: track gup-pinned pages")

96270263521248d5 cdaa813278ddc616ee201eacda7
---------------- ---------------------------
%stddev %change %stddev
\ | \
17845 ± 2% +25.1% 22326 ± 22% stress-ng.memcpy.ops
3568 ± 2% +25.1% 4464 ± 22% stress-ng.memcpy.ops_per_sec
55.50 ± 2% +3969.8% 2258 ± 3% stress-ng.numa.ops
9.41 ± 4% +4671.0% 449.07 ± 3% stress-ng.numa.ops_per_sec
277857 ± 5% +50.9% 419386 ± 4% stress-ng.time.involuntary_context_switches
326.50 ± 10% -16.2% 273.50 ± 3% stress-ng.vm-addr.ops
65.30 ± 10% -16.2% 54.71 ± 3% stress-ng.vm-addr.ops_per_sec
0.01 ±113% +0.0 0.02 ± 36% mpstat.cpu.iowait%
181260 +15.7% 209701 vmstat.system.in
64963 ± 32% +315.5% 269897 ± 38% numa-numastat.node0.other_node
66083 ± 33% +904.9% 664072 ± 18% numa-numastat.node1.other_node
1670 ± 2% +36.4% 2279 ± 3% slabinfo.numa_policy.active_objs
1670 ± 2% +36.4% 2279 ± 3% slabinfo.numa_policy.num_objs
38695830 +29.5% 50110710 turbostat.IRQ
10.08 -2.9% 9.79 turbostat.RAMWatt
917912 ± 10% +77.1% 1625317 ± 27% meminfo.Active
917736 ± 10% +77.1% 1624942 ± 27% meminfo.Active(anon)
425637 ± 10% +43.8% 611934 ± 18% meminfo.Inactive
425296 ± 10% +43.8% 611561 ± 18% meminfo.Inactive(anon)
3195353 ± 3% +26.7% 4047153 ± 14% meminfo.Memused
49.89 ± 5% -26.7% 36.55 ± 6% perf-stat.i.MPKI
4.372e+08 ± 2% -5.1% 4.151e+08 ± 3% perf-stat.i.cache-references
9.75 ± 5% -8.8% 8.90 ± 3% perf-stat.i.cpi
4.84 ± 3% -8.2% 4.44 ± 4% perf-stat.overall.MPKI
4.361e+08 ± 2% -5.2% 4.137e+08 ± 3% perf-stat.ps.cache-references
27667 ± 19% +32.4% 36627 ± 17% softirqs.CPU0.RCU
24079 ± 6% -12.6% 21054 ± 8% softirqs.CPU24.SCHED
29249 ± 4% +14.0% 33351 ± 4% softirqs.CPU25.RCU
29618 ± 4% +16.8% 34600 ± 8% softirqs.CPU27.RCU
37158 ± 3% -12.6% 32476 ± 12% softirqs.CPU69.RCU
446518 ± 23% +87.1% 835393 ± 18% numa-meminfo.node0.Active
446388 ± 23% +87.1% 835224 ± 18% numa-meminfo.node0.Active(anon)
279797 ± 60% +81.1% 506718 ± 7% numa-meminfo.node0.Inactive
279545 ± 60% +81.2% 506611 ± 7% numa-meminfo.node0.Inactive(anon)
57425 ± 7% +16.8% 67077 ± 3% numa-meminfo.node0.KReclaimable
1666995 ± 17% +37.9% 2299473 ± 4% numa-meminfo.node0.MemUsed
57425 ± 7% +16.8% 67077 ± 3% numa-meminfo.node0.SReclaimable
59307 ± 7% -18.7% 48201 ± 4% numa-meminfo.node1.KReclaimable
59307 ± 7% -18.7% 48201 ± 4% numa-meminfo.node1.SReclaimable
114842 ± 19% +86.4% 214054 ± 15% numa-vmstat.node0.nr_active_anon
71405 ± 61% +78.6% 127550 ± 8% numa-vmstat.node0.nr_inactive_anon
257.00 ± 12% -43.0% 146.50 ± 30% numa-vmstat.node0.nr_isolated_anon
14332 ± 7% +17.2% 16792 ± 3% numa-vmstat.node0.nr_slab_reclaimable
114840 ± 19% +86.4% 214050 ± 15% numa-vmstat.node0.nr_zone_active_anon
71404 ± 61% +78.6% 127549 ± 8% numa-vmstat.node0.nr_zone_inactive_anon
39507 ± 32% +250.0% 138280 ± 37% numa-vmstat.node0.numa_other
313.75 ± 5% -57.1% 134.50 ± 25% numa-vmstat.node1.nr_isolated_anon
14816 ± 7% -18.5% 12069 ± 4% numa-vmstat.node1.nr_slab_reclaimable
172051 ± 7% +177.8% 477913 ± 13% numa-vmstat.node1.numa_other
231722 ± 8% +72.6% 399865 ± 27% proc-vmstat.nr_active_anon
291.00 ± 16% +18.6% 345.00 ± 19% proc-vmstat.nr_anon_transparent_hugepages
107048 ± 8% +43.8% 153932 ± 19% proc-vmstat.nr_inactive_anon
643.25 ± 6% -55.9% 283.75 ± 25% proc-vmstat.nr_isolated_anon
231722 ± 8% +72.6% 399865 ± 27% proc-vmstat.nr_zone_active_anon
107048 ± 8% +43.8% 153932 ± 19% proc-vmstat.nr_zone_inactive_anon
131052 ± 3% +612.7% 933974 ± 3% proc-vmstat.numa_other
6.265e+08 ± 3% -13.6% 5.411e+08 ± 5% proc-vmstat.pgalloc_normal
6.264e+08 ± 3% -13.7% 5.403e+08 ± 5% proc-vmstat.pgfree
7486 ±122% +2087.9% 163793 ± 6% proc-vmstat.pgmigrate_fail
1047865 ± 18% +134.0% 2452064 ± 15% proc-vmstat.pgmigrate_success
234795 ± 26% -60.5% 92683 ± 78% proc-vmstat.thp_deferred_split_page
417.00 ±145% +3214.7% 13822 ±144% proc-vmstat.unevictable_pgs_cleared
417.50 ±144% +3211.2% 13824 ±144% proc-vmstat.unevictable_pgs_stranded
46.55 -46.6 0.00 perf-profile.calltrace.cycles-pp.__x64_sys_move_pages.do_syscall_64.entry_SYSCALL_64_after_hwframe
46.55 -46.6 0.00 perf-profile.calltrace.cycles-pp.kernel_move_pages.__x64_sys_move_pages.do_syscall_64.entry_SYSCALL_64_after_hwframe
46.47 -46.5 0.00 perf-profile.calltrace.cycles-pp.do_move_pages_to_node.kernel_move_pages.__x64_sys_move_pages.do_syscall_64.entry_SYSCALL_64_after_hwframe
46.47 -46.5 0.00 perf-profile.calltrace.cycles-pp.migrate_pages.do_move_pages_to_node.kernel_move_pages.__x64_sys_move_pages.do_syscall_64
45.32 -45.3 0.00 perf-profile.calltrace.cycles-pp.move_to_new_page.migrate_pages.do_move_pages_to_node.kernel_move_pages.__x64_sys_move_pages
45.32 -45.3 0.00 perf-profile.calltrace.cycles-pp.migrate_page.move_to_new_page.migrate_pages.do_move_pages_to_node.kernel_move_pages
45.22 -45.2 0.00 perf-profile.calltrace.cycles-pp.migrate_page_copy.migrate_page.move_to_new_page.migrate_pages.do_move_pages_to_node
43.53 -43.0 0.55 ± 62% perf-profile.calltrace.cycles-pp.copy_page.migrate_page_copy.migrate_page.move_to_new_page.migrate_pages
78.97 ± 2% -6.3 72.65 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
79.20 ± 2% -6.2 72.97 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
5.01 ± 3% -0.9 4.11 ± 3% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.01 ± 3% -0.9 4.12 ± 3% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.94 ± 3% -0.9 4.05 ± 3% perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
4.99 ± 3% -0.9 4.10 ± 3% perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.48 ± 3% -0.5 2.02 ± 3% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
2.47 ± 3% -0.5 2.02 ± 3% perf-profile.calltrace.cycles-pp.arch_tlb_finish_mmu.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap
2.37 ± 3% -0.4 1.93 ± 3% perf-profile.calltrace.cycles-pp.tlb_flush_mmu_free.arch_tlb_finish_mmu.tlb_finish_mmu.unmap_region.__do_munmap
2.35 ± 3% -0.4 1.92 ± 3% perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu_free.arch_tlb_finish_mmu.tlb_finish_mmu.unmap_region
2.38 ± 2% -0.4 1.96 ± 3% perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.lru_add_drain_cpu.lru_add_drain.unmap_region.__do_munmap
2.38 ± 3% -0.4 1.96 ± 3% perf-profile.calltrace.cycles-pp.lru_add_drain_cpu.lru_add_drain.unmap_region.__do_munmap.__vm_munmap
2.38 ± 3% -0.4 1.96 ± 3% perf-profile.calltrace.cycles-pp.lru_add_drain.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
2.21 ± 3% -0.4 1.80 ± 2% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu_free.arch_tlb_finish_mmu.tlb_finish_mmu
2.19 ± 3% -0.4 1.78 ± 2% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu_free.arch_tlb_finish_mmu
2.23 ± 2% -0.4 1.83 ± 3% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_add_drain_cpu.lru_add_drain.unmap_region
2.20 ± 2% -0.4 1.81 ± 3% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.lru_add_drain_cpu.lru_add_drain
0.58 ± 5% +0.2 0.77 ± 4% perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.__vfs_read.vfs_read.ksys_read
0.58 ± 7% +0.2 0.78 ± 4% perf-profile.calltrace.cycles-pp.anon_pipe_buf_release.pipe_read.__vfs_read.vfs_read.ksys_read
0.70 ± 16% +0.2 0.92 ± 6% perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read
0.65 ± 5% +0.2 0.88 perf-profile.calltrace.cycles-pp.selinux_file_permission.security_file_permission.vfs_write.ksys_write.do_syscall_64
0.66 ± 5% +0.2 0.89 ± 2% perf-profile.calltrace.cycles-pp.selinux_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
0.70 ± 16% +0.2 0.94 ± 6% perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read.__vfs_read
0.81 ± 5% +0.3 1.10 ± 3% perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__wake_up_common_lock.pipe_read.__vfs_read.vfs_read
0.83 ± 14% +0.3 1.12 ± 4% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_read.__vfs_read.vfs_read
0.40 ± 57% +0.3 0.71 perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_write.__vfs_write.vfs_write.ksys_write
0.58 ± 5% +0.3 0.92 ± 28% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.copy_page_from_iter.pipe_write.__vfs_write
0.28 ±100% +0.4 0.64 ± 8% perf-profile.calltrace.cycles-pp.pipe_wait.pipe_write.__vfs_write.vfs_write.ksys_write
0.65 ± 5% +0.4 1.02 ± 25% perf-profile.calltrace.cycles-pp.copyin.copy_page_from_iter.pipe_write.__vfs_write.vfs_write
0.26 ±100% +0.4 0.65 ± 4% perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.__vfs_read.vfs_read
1.04 ± 5% +0.4 1.44 perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.__vfs_write.vfs_write.ksys_write
1.06 ± 5% +0.4 1.46 ± 2% perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.__vfs_read.vfs_read.ksys_read
1.21 ± 5% +0.4 1.64 perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.14 ± 8% +0.5 1.61 perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_read.__vfs_read.vfs_read.ksys_read
1.30 ± 5% +0.5 1.78 ± 3% perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout.copy_page_to_iter.pipe_read.__vfs_read
1.16 ± 9% +0.5 1.64 perf-profile.calltrace.cycles-pp.mutex_spin_on_owner.__mutex_lock.pipe_write.__vfs_write.vfs_write
0.13 ±173% +0.5 0.63 ± 7% perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
1.37 ± 6% +0.5 1.86 ± 2% perf-profile.calltrace.cycles-pp.copyout.copy_page_to_iter.pipe_read.__vfs_read.vfs_read
1.34 ± 6% +0.5 1.85 perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.13 ±173% +0.5 0.64 perf-profile.calltrace.cycles-pp.file_has_perm.security_file_permission.vfs_write.ksys_write.do_syscall_64
0.14 ±173% +0.5 0.67 ± 2% perf-profile.calltrace.cycles-pp.file_has_perm.security_file_permission.vfs_read.ksys_read.do_syscall_64
0.00 +0.5 0.54 ± 5% perf-profile.calltrace.cycles-pp.rmap_walk_file.remove_migration_ptes.migrate_pages.migrate_to_node.do_migrate_pages
1.25 ± 4% +0.6 1.82 ± 14% perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.__vfs_write.vfs_write.ksys_write
0.00 +0.6 0.59 ± 4% perf-profile.calltrace.cycles-pp.fsnotify.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +0.6 0.59 ± 6% perf-profile.calltrace.cycles-pp.rmap_walk_anon.try_to_unmap.migrate_pages.migrate_to_node.do_migrate_pages
0.00 +0.6 0.60 ± 8% perf-profile.calltrace.cycles-pp.find_next_bit.do_migrate_pages.kernel_migrate_pages.__x64_sys_migrate_pages.do_syscall_64
1.65 ± 7% +0.6 2.27 ± 4% perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_write.__vfs_write.vfs_write.ksys_write
0.00 +0.6 0.63 ± 6% perf-profile.calltrace.cycles-pp._raw_spin_lock.queue_pages_pte_range.__walk_page_range.walk_page_range.queue_pages_range
0.00 +0.7 0.65 ± 5% perf-profile.calltrace.cycles-pp.remove_migration_ptes.migrate_pages.migrate_to_node.do_migrate_pages.kernel_migrate_pages
1.89 ± 7% +0.7 2.56 perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_read.__vfs_read.vfs_read.ksys_read
0.00 +0.7 0.67 ± 5% perf-profile.calltrace.cycles-pp.queue_pages_test_walk.walk_page_range.queue_pages_range.migrate_to_node.do_migrate_pages
2.05 ± 5% +0.7 2.80 ± 2% perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.__vfs_read.vfs_read.ksys_read
0.00 +0.8 0.82 ± 23% perf-profile.calltrace.cycles-pp.migrate_page.move_to_new_page.migrate_pages.migrate_to_node.do_migrate_pages
0.00 +0.8 0.82 ± 23% perf-profile.calltrace.cycles-pp.move_to_new_page.migrate_pages.migrate_to_node.do_migrate_pages.kernel_migrate_pages
0.00 +0.8 0.85 ± 5% perf-profile.calltrace.cycles-pp._vm_normal_page.queue_pages_pte_range.__walk_page_range.walk_page_range.queue_pages_range
2.79 ± 4% +0.9 3.68 ± 2% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
0.00 +0.9 0.89 ± 3% perf-profile.calltrace.cycles-pp.bitmap_ord_to_pos.do_migrate_pages.kernel_migrate_pages.__x64_sys_migrate_pages.do_syscall_64
2.80 ± 6% +0.9 3.73 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
0.00 +1.2 1.21 ± 7% perf-profile.calltrace.cycles-pp.smp_call_function_single.on_each_cpu_mask.on_each_cpu_cond_mask.flush_tlb_mm_range.ptep_clear_flush
0.00 +1.3 1.25 ± 7% perf-profile.calltrace.cycles-pp.on_each_cpu_mask.on_each_cpu_cond_mask.flush_tlb_mm_range.ptep_clear_flush.try_to_unmap_one
0.00 +1.4 1.38 ± 7% perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.ptep_clear_flush.try_to_unmap_one.rmap_walk_file
3.73 ± 7% +1.4 5.11 ± 2% perf-profile.calltrace.cycles-pp.__mutex_lock.pipe_write.__vfs_write.vfs_write.ksys_write
0.00 +1.6 1.56 ± 7% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.ptep_clear_flush.try_to_unmap_one.rmap_walk_file.try_to_unmap
0.00 +1.6 1.63 ± 7% perf-profile.calltrace.cycles-pp.ptep_clear_flush.try_to_unmap_one.rmap_walk_file.try_to_unmap.migrate_pages
0.00 +2.2 2.20 ± 6% perf-profile.calltrace.cycles-pp.try_to_unmap_one.rmap_walk_file.try_to_unmap.migrate_pages.migrate_to_node
0.00 +2.3 2.30 ± 6% perf-profile.calltrace.cycles-pp.rmap_walk_file.try_to_unmap.migrate_pages.migrate_to_node.do_migrate_pages
0.00 +2.9 2.90 ± 6% perf-profile.calltrace.cycles-pp.try_to_unmap.migrate_pages.migrate_to_node.do_migrate_pages.kernel_migrate_pages
8.94 ± 5% +3.4 12.33 perf-profile.calltrace.cycles-pp.pipe_read.__vfs_read.vfs_read.ksys_read.do_syscall_64
9.24 ± 5% +3.5 12.73 perf-profile.calltrace.cycles-pp.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
13.65 ± 18% +3.7 17.34 ± 6% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
13.65 ± 18% +3.7 17.35 ± 6% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
13.65 ± 18% +3.7 17.35 ± 6% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
13.89 ± 17% +3.9 17.84 ± 6% perf-profile.calltrace.cycles-pp.secondary_startup_64
10.58 ± 5% +4.0 14.54 perf-profile.calltrace.cycles-pp.pipe_write.__vfs_write.vfs_write.ksys_write.do_syscall_64
10.90 ± 5% +4.1 14.95 perf-profile.calltrace.cycles-pp.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
11.36 ± 5% +4.3 15.62 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
11.83 ± 5% +4.4 16.24 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +4.5 4.54 ± 5% perf-profile.calltrace.cycles-pp.migrate_pages.migrate_to_node.do_migrate_pages.kernel_migrate_pages.__x64_sys_migrate_pages
12.69 ± 5% +4.7 17.35 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
13.15 ± 5% +4.8 17.99 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +5.1 5.10 ± 3% perf-profile.calltrace.cycles-pp.__bitmap_weight.bitmap_bitremap.do_migrate_pages.kernel_migrate_pages.__x64_sys_migrate_pages
0.00 +6.7 6.65 ± 4% perf-profile.calltrace.cycles-pp.__bitmap_weight.do_migrate_pages.kernel_migrate_pages.__x64_sys_migrate_pages.do_syscall_64
0.00 +7.9 7.85 ± 4% perf-profile.calltrace.cycles-pp.bitmap_bitremap.do_migrate_pages.kernel_migrate_pages.__x64_sys_migrate_pages.do_syscall_64
0.00 +8.3 8.31 ± 6% perf-profile.calltrace.cycles-pp.queue_pages_pte_range.__walk_page_range.walk_page_range.queue_pages_range.migrate_to_node
0.00 +9.4 9.39 ± 6% perf-profile.calltrace.cycles-pp.__walk_page_range.walk_page_range.queue_pages_range.migrate_to_node.do_migrate_pages
0.00 +10.3 10.28 ± 6% perf-profile.calltrace.cycles-pp.walk_page_range.queue_pages_range.migrate_to_node.do_migrate_pages.kernel_migrate_pages
0.00 +10.3 10.31 ± 6% perf-profile.calltrace.cycles-pp.queue_pages_range.migrate_to_node.do_migrate_pages.kernel_migrate_pages.__x64_sys_migrate_pages
0.63 ± 6% +14.3 14.88 ± 4% perf-profile.calltrace.cycles-pp.migrate_to_node.do_migrate_pages.kernel_migrate_pages.__x64_sys_migrate_pages.do_syscall_64
1.25 ± 3% +31.2 32.42 ± 4% perf-profile.calltrace.cycles-pp.do_migrate_pages.kernel_migrate_pages.__x64_sys_migrate_pages.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.26 ± 3% +31.4 32.62 ± 4% perf-profile.calltrace.cycles-pp.__x64_sys_migrate_pages.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.26 ± 3% +31.4 32.62 ± 4% perf-profile.calltrace.cycles-pp.kernel_migrate_pages.__x64_sys_migrate_pages.do_syscall_64.entry_SYSCALL_64_after_hwframe
360536 ± 3% +1526.6% 5864634 ± 6% interrupts.CAL:Function_call_interrupts
4122 ± 31% +65.0% 6803 interrupts.CPU0.NMI:Non-maskable_interrupts
4122 ± 31% +65.0% 6803 interrupts.CPU0.PMI:Performance_monitoring_interrupts
6247 ± 9% +23.4% 7708 ± 17% interrupts.CPU0.RES:Rescheduling_interrupts
3735 ± 15% +2257.0% 88044 ± 33% interrupts.CPU1.CAL:Function_call_interrupts
6593 ± 16% +23.6% 8150 ± 11% interrupts.CPU1.RES:Rescheduling_interrupts
1605 ± 48% +5360.4% 87680 ± 33% interrupts.CPU1.TLB:TLB_shootdowns
3668 ± 27% +2395.9% 91551 ± 17% interrupts.CPU10.CAL:Function_call_interrupts
3705 ± 45% +87.8% 6957 ± 14% interrupts.CPU10.NMI:Non-maskable_interrupts
3705 ± 45% +87.8% 6957 ± 14% interrupts.CPU10.PMI:Performance_monitoring_interrupts
3692 ± 5% +50.9% 5571 ± 2% interrupts.CPU10.RES:Rescheduling_interrupts
1194 ± 74% +7530.5% 91146 ± 18% interrupts.CPU10.TLB:TLB_shootdowns
3833 ± 29% +2767.2% 109921 ± 10% interrupts.CPU11.CAL:Function_call_interrupts
3631 ± 2% +86.9% 6787 ± 8% interrupts.CPU11.RES:Rescheduling_interrupts
1569 ± 76% +6909.2% 109992 ± 11% interrupts.CPU11.TLB:TLB_shootdowns
3557 ± 35% +2148.0% 79971 ± 54% interrupts.CPU12.CAL:Function_call_interrupts
3760 ± 11% +58.0% 5942 ± 22% interrupts.CPU12.RES:Rescheduling_interrupts
1104 ±119% +7079.3% 79277 ± 56% interrupts.CPU12.TLB:TLB_shootdowns
3639 ± 32% +2257.8% 85817 ± 28% interrupts.CPU13.CAL:Function_call_interrupts
3497 ± 65% +76.1% 6159 ± 26% interrupts.CPU13.NMI:Non-maskable_interrupts
3497 ± 65% +76.1% 6159 ± 26% interrupts.CPU13.PMI:Performance_monitoring_interrupts
3799 ± 3% +46.3% 5558 ± 11% interrupts.CPU13.RES:Rescheduling_interrupts
1329 ± 89% +6373.1% 86075 ± 29% interrupts.CPU13.TLB:TLB_shootdowns
3942 ± 31% +2512.7% 103013 ± 12% interrupts.CPU14.CAL:Function_call_interrupts
3736 ± 4% +76.7% 6601 ± 14% interrupts.CPU14.RES:Rescheduling_interrupts
1491 ± 75% +6811.2% 103097 ± 12% interrupts.CPU14.TLB:TLB_shootdowns
3605 ± 33% +2212.0% 83354 ± 58% interrupts.CPU15.CAL:Function_call_interrupts
3624 ± 62% +85.4% 6720 ± 22% interrupts.CPU15.NMI:Non-maskable_interrupts
3624 ± 62% +85.4% 6720 ± 22% interrupts.CPU15.PMI:Performance_monitoring_interrupts
3751 ± 3% +41.9% 5325 ± 19% interrupts.CPU15.RES:Rescheduling_interrupts
1208 ± 98% +6788.3% 83227 ± 59% interrupts.CPU15.TLB:TLB_shootdowns
3738 ± 28% +2823.1% 109274 ± 29% interrupts.CPU16.CAL:Function_call_interrupts
3584 ± 3% +73.9% 6232 ± 17% interrupts.CPU16.RES:Rescheduling_interrupts
1749 ± 76% +6151.7% 109343 ± 29% interrupts.CPU16.TLB:TLB_shootdowns
3590 ± 36% +2164.9% 81327 ± 44% interrupts.CPU17.CAL:Function_call_interrupts
3515 ± 66% +67.5% 5887 ± 31% interrupts.CPU17.NMI:Non-maskable_interrupts
3515 ± 66% +67.5% 5887 ± 31% interrupts.CPU17.PMI:Performance_monitoring_interrupts
3602 ± 2% +59.5% 5746 ± 14% interrupts.CPU17.RES:Rescheduling_interrupts
1191 ±109% +6686.6% 80845 ± 46% interrupts.CPU17.TLB:TLB_shootdowns
4548 ± 23% +2020.8% 96469 ± 48% interrupts.CPU18.CAL:Function_call_interrupts
3822 ± 8% +52.3% 5822 ± 21% interrupts.CPU18.RES:Rescheduling_interrupts
2195 ± 50% +4291.9% 96414 ± 48% interrupts.CPU18.TLB:TLB_shootdowns
3596 ± 15% +1942.4% 73450 ± 82% interrupts.CPU2.CAL:Function_call_interrupts
5367 ± 6% +33.2% 7150 ± 6% interrupts.CPU2.NMI:Non-maskable_interrupts
5367 ± 6% +33.2% 7150 ± 6% interrupts.CPU2.PMI:Performance_monitoring_interrupts
3608 ± 33% +1494.0% 57512 ± 48% interrupts.CPU20.CAL:Function_call_interrupts
3723 ± 3% +36.6% 5085 ± 11% interrupts.CPU20.RES:Rescheduling_interrupts
1159 ± 97% +4754.8% 56304 ± 50% interrupts.CPU20.TLB:TLB_shootdowns
3686 ± 30% +2621.0% 100315 ± 33% interrupts.CPU21.CAL:Function_call_interrupts
3574 ± 67% +77.6% 6349 ± 25% interrupts.CPU21.NMI:Non-maskable_interrupts
3574 ± 67% +77.6% 6349 ± 25% interrupts.CPU21.PMI:Performance_monitoring_interrupts
3835 ± 13% +65.2% 6337 ± 15% interrupts.CPU21.RES:Rescheduling_interrupts
1162 ± 98% +8509.6% 100065 ± 34% interrupts.CPU21.TLB:TLB_shootdowns
3687 ± 27% +2336.0% 89813 ± 50% interrupts.CPU22.CAL:Function_call_interrupts
3810 +83.5% 6993 ± 27% interrupts.CPU22.RES:Rescheduling_interrupts
1484 ± 66% +5953.7% 89867 ± 51% interrupts.CPU22.TLB:TLB_shootdowns
3974 ± 17% +2750.3% 113270 ± 15% interrupts.CPU23.CAL:Function_call_interrupts
3926 ± 11% +80.7% 7096 ± 9% interrupts.CPU23.RES:Rescheduling_interrupts
1464 ± 53% +7676.5% 113848 ± 15% interrupts.CPU23.TLB:TLB_shootdowns
4345 ± 14% +2287.9% 103770 ± 13% interrupts.CPU24.CAL:Function_call_interrupts
4072 ± 9% +84.4% 7507 ± 13% interrupts.CPU24.RES:Rescheduling_interrupts
1858 ± 38% +5495.7% 103967 ± 14% interrupts.CPU24.TLB:TLB_shootdowns
3807 ± 26% +2463.3% 97585 ± 30% interrupts.CPU25.CAL:Function_call_interrupts
3852 ± 4% +78.3% 6870 ± 14% interrupts.CPU25.RES:Rescheduling_interrupts
1368 ± 72% +7011.0% 97296 ± 31% interrupts.CPU25.TLB:TLB_shootdowns
3574 ± 21% +1729.3% 65390 ± 48% interrupts.CPU26.CAL:Function_call_interrupts
6245 ± 7% -30.6% 4334 ± 26% interrupts.CPU26.NMI:Non-maskable_interrupts
6245 ± 7% -30.6% 4334 ± 26% interrupts.CPU26.PMI:Performance_monitoring_interrupts
4092 ± 18% +37.4% 5623 ± 14% interrupts.CPU26.RES:Rescheduling_interrupts
1168 ± 64% +5429.2% 64594 ± 50% interrupts.CPU26.TLB:TLB_shootdowns
6377 ± 10% -34.5% 4179 ± 23% interrupts.CPU27.NMI:Non-maskable_interrupts
6377 ± 10% -34.5% 4179 ± 23% interrupts.CPU27.PMI:Performance_monitoring_interrupts
6425 ± 14% -38.7% 3937 ± 23% interrupts.CPU28.NMI:Non-maskable_interrupts
6425 ± 14% -38.7% 3937 ± 23% interrupts.CPU28.PMI:Performance_monitoring_interrupts
4880 ± 4% +894.4% 48528 ± 19% interrupts.CPU29.CAL:Function_call_interrupts
3951 ± 6% +25.6% 4962 ± 6% interrupts.CPU29.RES:Rescheduling_interrupts
2365 ± 13% +1898.5% 47274 ± 20% interrupts.CPU29.TLB:TLB_shootdowns
3890 ± 13% +3046.4% 122404 ± 11% interrupts.CPU3.CAL:Function_call_interrupts
4834 ± 22% +53.2% 7407 ± 4% interrupts.CPU3.NMI:Non-maskable_interrupts
4834 ± 22% +53.2% 7407 ± 4% interrupts.CPU3.PMI:Performance_monitoring_interrupts
4600 ± 3% +89.1% 8699 ± 25% interrupts.CPU3.RES:Rescheduling_interrupts
1585 ± 50% +7626.2% 122460 ± 12% interrupts.CPU3.TLB:TLB_shootdowns
4598 ± 15% +1179.7% 58846 ± 45% interrupts.CPU31.CAL:Function_call_interrupts
3672 ± 3% +61.8% 5941 ± 14% interrupts.CPU31.RES:Rescheduling_interrupts
2274 ± 27% +2438.3% 57732 ± 47% interrupts.CPU31.TLB:TLB_shootdowns
4554 ± 21% +1114.9% 55329 ± 56% interrupts.CPU32.CAL:Function_call_interrupts
3741 +44.6% 5408 ± 21% interrupts.CPU32.RES:Rescheduling_interrupts
2105 ± 41% +2460.0% 53894 ± 59% interrupts.CPU32.TLB:TLB_shootdowns
3907 ± 30% +860.8% 37539 ± 65% interrupts.CPU35.CAL:Function_call_interrupts
3871 ± 9% +22.8% 4752 ± 18% interrupts.CPU35.RES:Rescheduling_interrupts
1638 ± 73% +2093.4% 35938 ± 68% interrupts.CPU35.TLB:TLB_shootdowns
4647 ± 23% +1627.4% 80281 ± 12% interrupts.CPU36.CAL:Function_call_interrupts
3821 ± 5% +57.4% 6015 ± 9% interrupts.CPU36.RES:Rescheduling_interrupts
2181 ± 45% +3557.4% 79776 ± 12% interrupts.CPU36.TLB:TLB_shootdowns
7070 ± 14% -30.3% 4924 ± 24% interrupts.CPU38.NMI:Non-maskable_interrupts
7070 ± 14% -30.3% 4924 ± 24% interrupts.CPU38.PMI:Performance_monitoring_interrupts
3711 ± 12% +1962.2% 76543 ± 53% interrupts.CPU4.CAL:Function_call_interrupts
4987 ± 21% +43.4% 7152 ± 3% interrupts.CPU4.NMI:Non-maskable_interrupts
4987 ± 21% +43.4% 7152 ± 3% interrupts.CPU4.PMI:Performance_monitoring_interrupts
3815 ± 2% +60.4% 6121 ± 20% interrupts.CPU4.RES:Rescheduling_interrupts
1324 ± 33% +5629.5% 75872 ± 55% interrupts.CPU4.TLB:TLB_shootdowns
4648 ± 22% +814.8% 42517 ± 39% interrupts.CPU43.CAL:Function_call_interrupts
2282 ± 35% +1704.8% 41186 ± 41% interrupts.CPU43.TLB:TLB_shootdowns
3602 ± 29% +2533.9% 94873 ± 42% interrupts.CPU45.CAL:Function_call_interrupts
4435 ± 44% +62.7% 7217 ± 10% interrupts.CPU45.NMI:Non-maskable_interrupts
4435 ± 44% +62.7% 7217 ± 10% interrupts.CPU45.PMI:Performance_monitoring_interrupts
3712 ± 4% +57.8% 5856 ± 21% interrupts.CPU45.RES:Rescheduling_interrupts
1180 ± 98% +7913.7% 94601 ± 44% interrupts.CPU45.TLB:TLB_shootdowns
4762 ± 26% +49.6% 7124 ± 12% interrupts.CPU46.NMI:Non-maskable_interrupts
4762 ± 26% +49.6% 7124 ± 12% interrupts.CPU46.PMI:Performance_monitoring_interrupts
1165 ±115% +5808.7% 68836 ± 84% interrupts.CPU47.TLB:TLB_shootdowns
3486 ± 30% +2453.1% 89006 ± 60% interrupts.CPU48.CAL:Function_call_interrupts
4565 ± 46% +55.6% 7104 ± 14% interrupts.CPU48.NMI:Non-maskable_interrupts
4565 ± 46% +55.6% 7104 ± 14% interrupts.CPU48.PMI:Performance_monitoring_interrupts
3695 ± 2% +70.3% 6295 ± 20% interrupts.CPU48.RES:Rescheduling_interrupts
1073 ±111% +8148.9% 88511 ± 62% interrupts.CPU48.TLB:TLB_shootdowns
4106 ± 47% +69.1% 6944 ± 14% interrupts.CPU49.NMI:Non-maskable_interrupts
4106 ± 47% +69.1% 6944 ± 14% interrupts.CPU49.PMI:Performance_monitoring_interrupts
1185 ± 95% +5391.3% 65112 ± 98% interrupts.CPU49.TLB:TLB_shootdowns
3601 ± 8% +2568.7% 96098 ± 29% interrupts.CPU5.CAL:Function_call_interrupts
3785 ± 26% +91.1% 7233 ± 3% interrupts.CPU5.NMI:Non-maskable_interrupts
3785 ± 26% +91.1% 7233 ± 3% interrupts.CPU5.PMI:Performance_monitoring_interrupts
3668 ± 2% +74.1% 6386 ± 16% interrupts.CPU5.RES:Rescheduling_interrupts
1137 ± 29% +8322.2% 95802 ± 30% interrupts.CPU5.TLB:TLB_shootdowns
3745 ± 41% +84.0% 6890 ± 18% interrupts.CPU50.NMI:Non-maskable_interrupts
3745 ± 41% +84.0% 6890 ± 18% interrupts.CPU50.PMI:Performance_monitoring_interrupts
654.50 ± 68% +12823.4% 84583 ± 66% interrupts.CPU50.TLB:TLB_shootdowns
3754 ± 59% +84.3% 6921 ± 18% interrupts.CPU51.NMI:Non-maskable_interrupts
3754 ± 59% +84.3% 6921 ± 18% interrupts.CPU51.PMI:Performance_monitoring_interrupts
3616 ± 4% +63.4% 5907 ± 23% interrupts.CPU51.RES:Rescheduling_interrupts
1051 ±115% +7965.0% 84783 ± 67% interrupts.CPU51.TLB:TLB_shootdowns
3537 ± 35% +1280.8% 48845 ± 62% interrupts.CPU52.CAL:Function_call_interrupts
1067 ±112% +4353.9% 47545 ± 65% interrupts.CPU52.TLB:TLB_shootdowns
1112 ±109% +4960.5% 56285 ± 97% interrupts.CPU53.TLB:TLB_shootdowns
3841 ± 56% +80.5% 6932 ± 20% interrupts.CPU54.NMI:Non-maskable_interrupts
3841 ± 56% +80.5% 6932 ± 20% interrupts.CPU54.PMI:Performance_monitoring_interrupts
3561 +69.5% 6036 ± 23% interrupts.CPU54.RES:Rescheduling_interrupts
3543 ± 32% +1357.6% 51648 ± 60% interrupts.CPU55.CAL:Function_call_interrupts
3537 ± 2% +39.5% 4933 ± 13% interrupts.CPU55.RES:Rescheduling_interrupts
1144 ±103% +4304.4% 50397 ± 63% interrupts.CPU55.TLB:TLB_shootdowns
3550 ± 35% +2444.4% 90343 ± 56% interrupts.CPU56.CAL:Function_call_interrupts
3650 ± 65% +94.0% 7082 ± 11% interrupts.CPU56.NMI:Non-maskable_interrupts
3650 ± 65% +94.0% 7082 ± 11% interrupts.CPU56.PMI:Performance_monitoring_interrupts
3651 +65.0% 6023 ± 23% interrupts.CPU56.RES:Rescheduling_interrupts
1136 ±112% +7824.5% 90062 ± 58% interrupts.CPU56.TLB:TLB_shootdowns
2575 ± 33% +170.5% 6966 ± 17% interrupts.CPU57.NMI:Non-maskable_interrupts
2575 ± 33% +170.5% 6966 ± 17% interrupts.CPU57.PMI:Performance_monitoring_interrupts
3533 ± 3% +63.2% 5768 ± 24% interrupts.CPU57.RES:Rescheduling_interrupts
1119 ±104% +7722.0% 87547 ± 60% interrupts.CPU57.TLB:TLB_shootdowns
3626 +52.3% 5525 ± 18% interrupts.CPU58.RES:Rescheduling_interrupts
3579 ± 35% +2484.6% 92516 ± 52% interrupts.CPU59.CAL:Function_call_interrupts
2676 ± 32% +152.7% 6762 ± 23% interrupts.CPU59.NMI:Non-maskable_interrupts
2676 ± 32% +152.7% 6762 ± 23% interrupts.CPU59.PMI:Performance_monitoring_interrupts
3615 +63.4% 5907 ± 22% interrupts.CPU59.RES:Rescheduling_interrupts
1144 ±104% +7945.0% 92075 ± 53% interrupts.CPU59.TLB:TLB_shootdowns
3821 ± 15% +1528.2% 62212 ± 38% interrupts.CPU6.CAL:Function_call_interrupts
3942 ± 39% +71.2% 6748 ± 9% interrupts.CPU6.NMI:Non-maskable_interrupts
3942 ± 39% +71.2% 6748 ± 9% interrupts.CPU6.PMI:Performance_monitoring_interrupts
3903 ± 4% +47.5% 5759 ± 22% interrupts.CPU6.RES:Rescheduling_interrupts
1496 ± 36% +3987.9% 61185 ± 39% interrupts.CPU6.TLB:TLB_shootdowns
2798 ± 22% +139.4% 6699 ± 21% interrupts.CPU60.NMI:Non-maskable_interrupts
2798 ± 22% +139.4% 6699 ± 21% interrupts.CPU60.PMI:Performance_monitoring_interrupts
2957 ± 17% +123.3% 6604 ± 22% interrupts.CPU61.NMI:Non-maskable_interrupts
2957 ± 17% +123.3% 6604 ± 22% interrupts.CPU61.PMI:Performance_monitoring_interrupts
3680 ± 37% +2380.7% 91308 ± 40% interrupts.CPU62.CAL:Function_call_interrupts
4022 ± 14% +37.9% 5547 ± 11% interrupts.CPU62.RES:Rescheduling_interrupts
1136 ±124% +7892.6% 90855 ± 42% interrupts.CPU62.TLB:TLB_shootdowns
3634 ± 36% +2919.3% 109735 ± 24% interrupts.CPU63.CAL:Function_call_interrupts
3521 ± 26% +114.0% 7538 ± 2% interrupts.CPU63.NMI:Non-maskable_interrupts
3521 ± 26% +114.0% 7538 ± 2% interrupts.CPU63.PMI:Performance_monitoring_interrupts
3948 ± 10% +65.9% 6551 ± 18% interrupts.CPU63.RES:Rescheduling_interrupts
1152 ±110% +9401.5% 109528 ± 24% interrupts.CPU63.TLB:TLB_shootdowns
3592 ± 34% +3034.8% 112602 ± 26% interrupts.CPU64.CAL:Function_call_interrupts
2961 ± 17% +148.8% 7369 ± 9% interrupts.CPU64.NMI:Non-maskable_interrupts
2961 ± 17% +148.8% 7369 ± 9% interrupts.CPU64.PMI:Performance_monitoring_interrupts
4193 ± 19% +66.8% 6996 ± 10% interrupts.CPU64.RES:Rescheduling_interrupts
1104 ±106% +10101.5% 112650 ± 27% interrupts.CPU64.TLB:TLB_shootdowns
3509 ± 33% +2502.5% 91320 ± 30% interrupts.CPU65.CAL:Function_call_interrupts
2965 ± 17% +145.6% 7283 ± 6% interrupts.CPU65.NMI:Non-maskable_interrupts
2965 ± 17% +145.6% 7283 ± 6% interrupts.CPU65.PMI:Performance_monitoring_interrupts
3859 ± 8% +52.9% 5899 ± 13% interrupts.CPU65.RES:Rescheduling_interrupts
1037 ±112% +8669.9% 91009 ± 31% interrupts.CPU65.TLB:TLB_shootdowns
3370 ± 10% +2036.3% 71994 ± 5% interrupts.CPU7.CAL:Function_call_interrupts
3469 ± 44% +72.9% 5999 ± 24% interrupts.CPU7.NMI:Non-maskable_interrupts
3469 ± 44% +72.9% 5999 ± 24% interrupts.CPU7.PMI:Performance_monitoring_interrupts
3809 ± 4% +47.7% 5625 ± 11% interrupts.CPU7.RES:Rescheduling_interrupts
988.50 ± 61% +7115.2% 71322 ± 6% interrupts.CPU7.TLB:TLB_shootdowns
3654 ± 4% +80.9% 6612 ± 24% interrupts.CPU77.RES:Rescheduling_interrupts
3569 ± 25% +2737.1% 101271 ± 22% interrupts.CPU8.CAL:Function_call_interrupts
3412 ± 47% +81.7% 6202 ± 26% interrupts.CPU8.NMI:Non-maskable_interrupts
3412 ± 47% +81.7% 6202 ± 26% interrupts.CPU8.PMI:Performance_monitoring_interrupts
3822 ± 6% +72.8% 6605 ± 13% interrupts.CPU8.RES:Rescheduling_interrupts
2371 ± 57% +4169.2% 101244 ± 23% interrupts.CPU8.TLB:TLB_shootdowns
3280 ± 25% +2403.5% 82113 ± 56% interrupts.CPU9.CAL:Function_call_interrupts
3406 ± 50% +101.8% 6873 ± 11% interrupts.CPU9.NMI:Non-maskable_interrupts
3406 ± 50% +101.8% 6873 ± 11% interrupts.CPU9.PMI:Performance_monitoring_interrupts
3636 ± 6% +66.1% 6041 ± 24% interrupts.CPU9.RES:Rescheduling_interrupts
898.75 ±100% +8966.6% 81486 ± 58% interrupts.CPU9.TLB:TLB_shootdowns
441700 ± 6% +18.5% 523360 ± 5% interrupts.NMI:Non-maskable_interrupts
441700 ± 6% +18.5% 523360 ± 5% interrupts.PMI:Performance_monitoring_interrupts
344450 ± 2% +42.8% 491951 ± 2% interrupts.RES:Rescheduling_interrupts
153414 ± 10% +3677.4% 5795118 ± 6% interrupts.TLB:TLB_shootdowns



stress-ng.numa.ops

3500 +-+------------------------------------------------------------------+
| O O |
3000 +-+ O O O OO O O |
O O O O O O O O OO |
2500 +-+ O O |
| O O O O OO O OO |
2000 +-+ O O |
| |
1500 +-+ |
| |
1000 +-+ |
| |
500 +-+ |
| |
0 +-+------------------------------------------------------------------+


stress-ng.numa.ops_per_sec

700 +-+-------------------------------------------------------------------+
| O |
600 +-+ O O O OO O O O |
O O O O O O O O O O |
500 +-+ O |
| O OO OO O O OO O |
400 +-+ O O |
| |
300 +-+ |
| |
200 +-+ |
| |
100 +-+ |
| |
0 +-+-------------------------------------------------------------------+


[*] bisect-good sample
[O] bisect-bad sample

***************************************************************************************************
vm-snb-8G: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
=========================================================================================
compiler/kconfig/rootfs/tbox_group/test/testcase:
gcc-7/x86_64-rhel-7.2/debian-x86_64-2018-04-03.cgz/vm-snb-8G/cve/ltp

commit:
9627026352 ("mm: page_cache_add_speculative(): refactoring")
cdaa813278 ("mm/gup: track gup-pinned pages")

96270263521248d5 cdaa813278ddc616ee201eacda7
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:12 8% 1:12 dmesg.BUG:soft_lockup-CPU##stuck_for#s
:12 83% 10:12 dmesg.Kernel_panic-not_syncing:System_is_deadlocked_on_memory
:12 8% 1:12 dmesg.Kernel_panic-not_syncing:softlockup:hung_tasks
:12 83% 10:12 dmesg.Out_of_memory:Kill_process
:12 83% 10:12 dmesg.Out_of_memory_and_no_killable_processes
:12 8% 1:12 dmesg.RIP:free_reserved_area
%stddev %change %stddev
\ | \
1.00 -100.0% 0.00 ltp.cve-2017-18075.pass
783.53 ± 40% +165.9% 2083 ltp.time.elapsed_time
783.53 ± 40% +165.9% 2083 ltp.time.elapsed_time.max
1171525 ± 21% -69.1% 362268 ltp.time.involuntary_context_switches
11374889 ± 27% -89.7% 1171686 ltp.time.minor_page_faults
103.08 ± 9% -46.6% 55.00 ltp.time.percent_of_cpu_this_job_got
199802 ± 12% -36.0% 127912 ltp.time.voluntary_context_switches
557835 ± 3% +469.2% 3175044 meminfo.Active
557602 ± 3% +469.4% 3174826 meminfo.Active(anon)
204797 -51.1% 100217 meminfo.CmaFree
709192 ± 3% -11.8% 625168 meminfo.Inactive
22617 ± 16% -50.8% 11135 meminfo.Inactive(anon)
4772734 -52.9% 2247863 meminfo.MemAvailable
4273603 -57.4% 1820264 meminfo.MemFree
3888307 +63.1% 6341643 meminfo.Memused



***************************************************************************************************
lkp-skl-4sp1: 192 threads Skylake-4S with 704G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
gcc-7/performance/x86_64-rhel-7.2/process/100%/debian-x86_64-2018-04-03.cgz/lkp-skl-4sp1/futex1/will-it-scale

commit:
9627026352 ("mm: page_cache_add_speculative(): refactoring")
cdaa813278 ("mm/gup: track gup-pinned pages")

96270263521248d5 cdaa813278ddc616ee201eacda7
---------------- ---------------------------
%stddev %change %stddev
\ | \
0.42 ± 41% -37.6% 0.27 ± 38% turbostat.CPU%c1
436529 +18.1% 515400 ± 2% meminfo.Active
432417 +18.2% 511272 ± 2% meminfo.Active(anon)
1.01 ±100% +1140.2% 12.50 ±141% irq_exception_noise.__do_page_fault.50th
1.10 ±100% +1059.7% 12.70 ±137% irq_exception_noise.__do_page_fault.60th
1.18 ±100% +1002.0% 13.00 ±133% irq_exception_noise.__do_page_fault.70th
1.32 ±100% +934.8% 13.71 ±123% irq_exception_noise.__do_page_fault.80th
545804 ± 35% -56.4% 238144 ± 29% numa-numastat.node0.local_node
568271 ± 32% -54.7% 257664 ± 24% numa-numastat.node0.numa_hit
232178 ± 44% +135.2% 546083 ± 30% numa-numastat.node2.local_node
268049 ± 38% +114.9% 576076 ± 27% numa-numastat.node2.numa_hit
330370 ± 19% -39.1% 201131 ± 50% numa-numastat.node3.local_node
342159 ± 15% -31.7% 233865 ± 42% numa-numastat.node3.numa_hit
108167 +18.2% 127807 ± 2% proc-vmstat.nr_active_anon
6541 -1.8% 6425 proc-vmstat.nr_inactive_anon
8474 -2.4% 8271 ± 2% proc-vmstat.nr_mapped
108167 +18.2% 127807 ± 2% proc-vmstat.nr_zone_active_anon
6541 -1.8% 6425 proc-vmstat.nr_zone_inactive_anon
29259 +3.5% 30283 proc-vmstat.pgactivate
1477807 -2.1% 1446297 proc-vmstat.pgfree
13.29 ± 5% -2.6 10.72 ± 4% perf-stat.i.cache-miss-rate%
1107530 ± 4% -16.6% 923800 ± 3% perf-stat.i.cache-misses
57.31 -2.0% 56.14 perf-stat.i.cpu-migrations
429491 ± 3% +14.9% 493355 ± 4% perf-stat.i.cycles-between-cache-misses
51.31 +10.9 62.22 ± 6% perf-stat.i.iTLB-load-miss-rate%
3.369e+08 +2.2% 3.443e+08 perf-stat.i.iTLB-load-misses
3.2e+08 -32.8% 2.149e+08 ± 16% perf-stat.i.iTLB-loads
863.04 -2.4% 842.12 perf-stat.i.instructions-per-iTLB-miss
355.57 ± 6% -13.9% 306.21 ± 5% sched_debug.cfs_rq:/.exec_clock.stddev
0.46 +163.6% 1.21 ± 10% sched_debug.cfs_rq:/.nr_spread_over.avg
619.50 ± 6% -13.9% 533.25 ± 7% sched_debug.cfs_rq:/.util_est_enqueued.max
128891 +42.6% 183838 ± 30% sched_debug.cpu.avg_idle.stddev
11367 ± 2% +14.6% 13028 sched_debug.cpu.curr->pid.max
2547 ± 15% +30.3% 3321 sched_debug.cpu.curr->pid.min
-11.50 -23.9% -8.75 sched_debug.cpu.nr_uninterruptible.min
10647 ± 20% -40.5% 6331 ± 17% sched_debug.cpu.sched_goidle.max
800.32 ± 19% -31.8% 545.75 ± 5% sched_debug.cpu.sched_goidle.stddev
20627 +16.3% 23980 ± 6% softirqs.CPU0.RCU
112466 ± 4% -8.4% 102965 ± 4% softirqs.CPU0.TIMER
20826 +10.9% 23088 ± 6% softirqs.CPU110.RCU
20775 ± 8% +22.1% 25361 ± 5% softirqs.CPU112.RCU
20835 ± 10% +22.8% 25592 ± 5% softirqs.CPU113.RCU
21208 ± 6% +21.7% 25811 ± 4% softirqs.CPU114.RCU
20691 ± 7% +24.5% 25764 ± 6% softirqs.CPU115.RCU
20809 ± 8% +22.8% 25550 ± 5% softirqs.CPU116.RCU
20870 ± 9% +23.4% 25754 ± 4% softirqs.CPU117.RCU
20714 ± 8% +23.4% 25568 ± 5% softirqs.CPU118.RCU
20652 ± 8% +22.6% 25310 ± 5% softirqs.CPU119.RCU
20909 +10.8% 23171 ± 7% softirqs.CPU13.RCU
20488 +10.1% 22548 ± 5% softirqs.CPU14.RCU
20955 ± 6% +21.3% 25421 ± 6% softirqs.CPU16.RCU
19946 +15.2% 22982 ± 11% softirqs.CPU173.RCU
133691 ± 19% -22.1% 104181 ± 4% softirqs.CPU176.TIMER
21426 ± 3% +17.6% 25194 ± 4% softirqs.CPU18.RCU
20370 ± 6% +24.1% 25275 ± 3% softirqs.CPU19.RCU
21029 ± 4% +19.6% 25146 ± 5% softirqs.CPU20.RCU
20971 ± 8% +19.7% 25095 ± 4% softirqs.CPU21.RCU
20759 ± 6% +20.2% 24958 ± 5% softirqs.CPU22.RCU
20627 ± 6% +21.6% 25077 ± 5% softirqs.CPU23.RCU
20693 +10.5% 22867 ± 6% softirqs.CPU5.RCU
18951 ± 4% +21.1% 22953 ± 12% softirqs.CPU75.RCU
110575 ± 4% -7.5% 102329 ± 2% softirqs.CPU98.TIMER
57149 +37.8% 78776 ± 10% numa-vmstat.node0.nr_file_pages
1470 +84.0% 2706 ± 37% numa-vmstat.node0.nr_mapped
1020 ± 74% +1687.7% 18243 ± 45% numa-vmstat.node0.nr_shmem
16.50 ± 33% +8503.0% 1419 ±122% numa-vmstat.node1.nr_inactive_anon
6205 +19.9% 7437 ± 10% numa-vmstat.node1.nr_kernel_stack
594.50 ± 3% +141.9% 1438 ± 52% numa-vmstat.node1.nr_page_table_pages
56.00 ± 33% +2705.4% 1571 ±112% numa-vmstat.node1.nr_shmem
3072 ± 5% +119.8% 6751 ± 17% numa-vmstat.node1.nr_slab_reclaimable
11562 ± 2% +31.6% 15218 ± 9% numa-vmstat.node1.nr_slab_unreclaimable
16.50 ± 33% +8503.0% 1419 ±122% numa-vmstat.node1.nr_zone_inactive_anon
266341 ± 12% +131.7% 617001 ± 9% numa-vmstat.node1.numa_hit
140430 ± 23% +258.4% 503291 ± 12% numa-vmstat.node1.numa_local
1028 -100.0% 0.00 numa-vmstat.node2.nr_active_file
71145 -19.0% 57616 ± 2% numa-vmstat.node2.nr_file_pages
4668 ± 12% -67.9% 1500 ±104% numa-vmstat.node2.nr_inactive_anon
544.50 -100.0% 0.00 numa-vmstat.node2.nr_inactive_file
9199 ± 4% -28.3% 6593 ± 5% numa-vmstat.node2.nr_kernel_stack
3552 -50.5% 1758 ± 33% numa-vmstat.node2.nr_mapped
2955 ± 6% -65.5% 1018 ± 38% numa-vmstat.node2.nr_page_table_pages
4996 ± 11% -65.4% 1727 ± 95% numa-vmstat.node2.nr_shmem
64607 -13.5% 55888 numa-vmstat.node2.nr_unevictable
1028 -100.0% 0.00 numa-vmstat.node2.nr_zone_active_file
4668 ± 12% -67.9% 1500 ±104% numa-vmstat.node2.nr_zone_inactive_anon
544.50 -100.0% 0.00 numa-vmstat.node2.nr_zone_inactive_file
64607 -13.5% 55888 numa-vmstat.node2.nr_zone_unevictable
31367 ± 22% -70.8% 9145 ±104% numa-vmstat.node3.nr_active_anon
11913 ± 59% -87.6% 1476 ± 52% numa-vmstat.node3.nr_anon_pages
1000 ± 17% -72.0% 279.75 ±150% numa-vmstat.node3.nr_inactive_anon
698.00 ± 11% -15.7% 588.25 numa-vmstat.node3.nr_page_table_pages
6430 ± 14% -39.1% 3918 ± 20% numa-vmstat.node3.nr_slab_reclaimable
31366 ± 22% -70.8% 9145 ±104% numa-vmstat.node3.nr_zone_active_anon
1000 ± 17% -72.0% 279.75 ±150% numa-vmstat.node3.nr_zone_inactive_anon
345814 ± 42% -45.4% 188830 ± 51% numa-vmstat.node3.numa_local
228598 +37.8% 315090 ± 10% numa-meminfo.node0.FilePages
3152 ± 89% +354.3% 14320 ± 50% numa-meminfo.node0.Inactive
5883 +80.6% 10625 ± 34% numa-meminfo.node0.Mapped
4082 ± 74% +1687.1% 72956 ± 45% numa-meminfo.node0.Shmem
68.50 ± 31% +8993.8% 6229 ±107% numa-meminfo.node1.Inactive
68.50 ± 31% +8191.2% 5679 ±122% numa-meminfo.node1.Inactive(anon)
12286 ± 5% +119.8% 27005 ± 17% numa-meminfo.node1.KReclaimable
6207 +19.9% 7441 ± 10% numa-meminfo.node1.KernelStack
2372 ± 3% +142.7% 5757 ± 52% numa-meminfo.node1.PageTables
12286 ± 5% +119.8% 27005 ± 17% numa-meminfo.node1.SReclaimable
46242 ± 2% +31.6% 60873 ± 9% numa-meminfo.node1.SUnreclaim
223.50 ± 33% +2711.9% 6284 ±112% numa-meminfo.node1.Shmem
58528 +50.1% 87879 ± 11% numa-meminfo.node1.Slab
4112 -100.0% 0.00 numa-meminfo.node2.Active(file)
284580 -19.0% 230466 ± 2% numa-meminfo.node2.FilePages
20854 ± 11% -71.2% 6004 ±104% numa-meminfo.node2.Inactive
18673 ± 12% -67.8% 6004 ±104% numa-meminfo.node2.Inactive(anon)
2180 -100.0% 0.00 numa-meminfo.node2.Inactive(file)
9203 ± 4% -28.3% 6594 ± 5% numa-meminfo.node2.KernelStack
13712 -48.6% 7054 ± 33% numa-meminfo.node2.Mapped
883222 ± 6% -16.2% 740514 ± 13% numa-meminfo.node2.MemUsed
11829 ± 6% -65.8% 4040 ± 38% numa-meminfo.node2.PageTables
19985 ± 11% -65.4% 6911 ± 95% numa-meminfo.node2.Shmem
258431 -13.5% 223554 numa-meminfo.node2.Unevictable
125256 ± 22% -70.8% 36547 ±104% numa-meminfo.node3.Active
125256 ± 22% -70.8% 36547 ±104% numa-meminfo.node3.Active(anon)
24571 ± 83% -96.0% 974.75 ±173% numa-meminfo.node3.AnonHugePages
47677 ± 59% -87.7% 5876 ± 52% numa-meminfo.node3.AnonPages
3993 ± 18% -72.1% 1115 ±149% numa-meminfo.node3.Inactive
3991 ± 18% -72.1% 1115 ±149% numa-meminfo.node3.Inactive(anon)
25714 ± 14% -39.1% 15671 ± 20% numa-meminfo.node3.KReclaimable
696078 ± 2% -14.9% 592511 ± 8% numa-meminfo.node3.MemUsed
2772 ± 11% -15.9% 2332 numa-meminfo.node3.PageTables
25714 ± 14% -39.1% 15671 ± 20% numa-meminfo.node3.SReclaimable
78860 ± 8% -16.7% 65670 ± 6% numa-meminfo.node3.Slab
1702 ± 25% -44.9% 937.50 ± 17% interrupts.CPU0.RES:Rescheduling_interrupts
6102 ± 5% -3.6% 5884 ± 5% interrupts.CPU100.CAL:Function_call_interrupts
6091 ± 5% -4.1% 5842 ± 5% interrupts.CPU101.CAL:Function_call_interrupts
6100 ± 5% -4.3% 5841 ± 6% interrupts.CPU102.CAL:Function_call_interrupts
6099 ± 5% -3.6% 5882 ± 5% interrupts.CPU103.CAL:Function_call_interrupts
6099 ± 5% -3.5% 5883 ± 5% interrupts.CPU104.CAL:Function_call_interrupts
6099 ± 5% -3.6% 5882 ± 5% interrupts.CPU106.CAL:Function_call_interrupts
6119 ± 5% -3.8% 5886 ± 5% interrupts.CPU107.CAL:Function_call_interrupts
6115 ± 5% -8.8% 5575 ± 4% interrupts.CPU108.CAL:Function_call_interrupts
6115 ± 5% -3.6% 5894 ± 4% interrupts.CPU109.CAL:Function_call_interrupts
6118 ± 5% -3.9% 5879 ± 5% interrupts.CPU110.CAL:Function_call_interrupts
6117 ± 5% -3.8% 5884 ± 4% interrupts.CPU111.CAL:Function_call_interrupts
291.50 ± 86% -90.1% 29.00 ± 33% interrupts.CPU111.RES:Rescheduling_interrupts
6117 ± 5% -3.6% 5897 ± 5% interrupts.CPU112.CAL:Function_call_interrupts
6117 ± 5% -3.5% 5902 ± 5% interrupts.CPU113.CAL:Function_call_interrupts
6116 ± 5% -3.5% 5904 ± 5% interrupts.CPU114.CAL:Function_call_interrupts
6116 ± 5% -3.6% 5897 ± 5% interrupts.CPU115.CAL:Function_call_interrupts
6117 ± 5% -3.6% 5897 ± 5% interrupts.CPU116.CAL:Function_call_interrupts
6118 ± 5% -3.6% 5896 ± 5% interrupts.CPU117.CAL:Function_call_interrupts
6088 ± 5% -5.7% 5744 interrupts.CPU12.CAL:Function_call_interrupts
1079 ± 78% -77.3% 245.50 ±134% interrupts.CPU120.RES:Rescheduling_interrupts
949.50 ± 91% -93.4% 63.00 ± 35% interrupts.CPU122.RES:Rescheduling_interrupts
6099 ± 5% -3.6% 5879 ± 4% interrupts.CPU13.CAL:Function_call_interrupts
2647 ± 18% -82.0% 477.00 ± 33% interrupts.CPU13.RES:Rescheduling_interrupts
208.50 ± 51% -51.3% 101.50 ± 36% interrupts.CPU14.RES:Rescheduling_interrupts
53.00 ± 50% +502.4% 319.25 ± 68% interrupts.CPU140.RES:Rescheduling_interrupts
20.00 ± 55% +392.5% 98.50 ± 35% interrupts.CPU141.RES:Rescheduling_interrupts
6093 ± 5% -4.2% 5838 ± 5% interrupts.CPU144.CAL:Function_call_interrupts
206.00 ± 59% -63.7% 74.75 ± 75% interrupts.CPU145.RES:Rescheduling_interrupts
6100 ± 5% -3.9% 5865 ± 5% interrupts.CPU15.CAL:Function_call_interrupts
8.00 +575.0% 54.00 ± 60% interrupts.CPU152.RES:Rescheduling_interrupts
969.00 ± 67% -91.8% 79.25 ± 56% interrupts.CPU153.RES:Rescheduling_interrupts
922.00 ± 91% -74.7% 233.50 ±134% interrupts.CPU159.RES:Rescheduling_interrupts
6098 ± 5% -3.9% 5863 ± 5% interrupts.CPU16.CAL:Function_call_interrupts
6097 ± 5% -4.2% 5841 ± 4% interrupts.CPU17.CAL:Function_call_interrupts
1077 ± 89% -94.5% 58.75 ± 78% interrupts.CPU176.RES:Rescheduling_interrupts
710144 ± 12% -13.5% 614236 interrupts.CPU182.LOC:Local_timer_interrupts
303.50 ± 78% -92.6% 22.50 ± 50% interrupts.CPU186.RES:Rescheduling_interrupts
6096 ± 5% -6.4% 5704 interrupts.CPU19.CAL:Function_call_interrupts
271.00 ± 71% -86.6% 36.25 ± 81% interrupts.CPU190.RES:Rescheduling_interrupts
6109 ± 5% -3.8% 5875 ± 5% interrupts.CPU26.CAL:Function_call_interrupts
6114 ± 5% -4.1% 5864 ± 5% interrupts.CPU27.CAL:Function_call_interrupts
6112 ± 5% -4.1% 5864 ± 5% interrupts.CPU28.CAL:Function_call_interrupts
6085 ± 6% -3.7% 5859 ± 5% interrupts.CPU29.CAL:Function_call_interrupts
780665 ± 20% -21.3% 614413 interrupts.CPU3.LOC:Local_timer_interrupts
1969 ± 90% -88.1% 235.00 ±106% interrupts.CPU3.RES:Rescheduling_interrupts
6085 ± 6% -3.8% 5855 ± 5% interrupts.CPU30.CAL:Function_call_interrupts
6079 ± 6% -3.7% 5853 ± 5% interrupts.CPU31.CAL:Function_call_interrupts
6080 ± 6% -3.8% 5847 ± 5% interrupts.CPU32.CAL:Function_call_interrupts
6077 ± 6% -3.7% 5850 ± 5% interrupts.CPU33.CAL:Function_call_interrupts
806.00 ± 11% -73.2% 216.00 ± 74% interrupts.CPU35.RES:Rescheduling_interrupts
1731 ± 3% -83.6% 283.75 ± 58% interrupts.CPU37.RES:Rescheduling_interrupts
45.50 ± 45% +1009.3% 504.75 ± 86% interrupts.CPU40.RES:Rescheduling_interrupts
41.50 ± 20% +222.3% 133.75 ± 47% interrupts.CPU44.RES:Rescheduling_interrupts
86.50 ± 26% +414.7% 445.25 ± 45% interrupts.CPU49.RES:Rescheduling_interrupts
129.50 ± 30% +1750.4% 2396 ± 94% interrupts.CPU53.RES:Rescheduling_interrupts
239.50 ± 51% +291.6% 938.00 ± 34% interrupts.CPU56.RES:Rescheduling_interrupts
6153 ± 5% -4.4% 5881 ± 5% interrupts.CPU58.CAL:Function_call_interrupts
87.00 ± 8% +1240.2% 1166 ± 58% interrupts.CPU63.RES:Rescheduling_interrupts
244.50 ± 29% -51.7% 118.00 ± 30% interrupts.CPU69.RES:Rescheduling_interrupts
1242 ± 64% -82.3% 220.00 ± 33% interrupts.CPU75.RES:Rescheduling_interrupts
173.50 ± 18% -43.4% 98.25 ± 10% interrupts.CPU76.RES:Rescheduling_interrupts
2162 ± 60% -76.4% 510.50 ±123% interrupts.CPU77.RES:Rescheduling_interrupts
74.00 ± 5% +285.5% 285.25 ± 45% interrupts.CPU8.RES:Rescheduling_interrupts
633.00 ± 34% -56.6% 274.50 ± 97% interrupts.CPU82.RES:Rescheduling_interrupts
6229 ± 7% -5.3% 5898 ± 5% interrupts.CPU84.CAL:Function_call_interrupts
6123 ± 5% -3.6% 5900 ± 5% interrupts.CPU86.CAL:Function_call_interrupts
943284 ± 34% -34.8% 614662 interrupts.CPU86.LOC:Local_timer_interrupts
6122 ± 5% -3.6% 5899 ± 5% interrupts.CPU87.CAL:Function_call_interrupts
6120 ± 5% -3.6% 5898 ± 5% interrupts.CPU88.CAL:Function_call_interrupts
6118 ± 5% -3.5% 5902 ± 5% interrupts.CPU89.CAL:Function_call_interrupts
1286 ± 83% -66.7% 428.50 ±118% interrupts.CPU9.RES:Rescheduling_interrupts
6114 ± 5% -3.5% 5900 ± 5% interrupts.CPU91.CAL:Function_call_interrupts
6118 ± 5% -3.4% 5908 ± 5% interrupts.CPU92.CAL:Function_call_interrupts
6117 ± 5% -3.4% 5907 ± 5% interrupts.CPU93.CAL:Function_call_interrupts
6118 ± 5% -3.4% 5907 ± 5% interrupts.CPU94.CAL:Function_call_interrupts
6074 ± 5% -3.8% 5841 ± 5% interrupts.CPU95.CAL:Function_call_interrupts
8980 ± 11% -9.6% 8117 ± 4% interrupts.CPU95.RES:Rescheduling_interrupts
6116 ± 5% -3.8% 5882 ± 5% interrupts.CPU96.CAL:Function_call_interrupts
6116 ± 5% -5.4% 5786 ± 2% interrupts.CPU97.CAL:Function_call_interrupts
6115 ± 5% -3.7% 5890 ± 5% interrupts.CPU99.CAL:Function_call_interrupts
846020 ± 26% -27.6% 612307 interrupts.CPU99.LOC:Local_timer_interrupts
311.00 -34.5% 203.75 ± 44% interrupts.TLB:TLB_shootdowns
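
Each block in this report (machine header, parameter string, commit pair, then
the metric deltas) is one lkp robot job. For anyone wanting to re-run a section,
the usual lkp-tests flow is sketched below; it assumes a job file matching the
header of the section of interest, which is not included in this excerpt:

	git clone https://github.com/intel/lkp-tests.git
	cd lkp-tests
	bin/lkp install job.yaml    # job file describing that section's parameters
	bin/lkp run     job.yaml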



***************************************************************************************************
lkp-knm02: 272 threads Intel(R) Xeon Phi(TM) CPU 7255 @ 1.10GHz with 112G memory
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime:
cpu/gcc-7/performance/1HDD/x86_64-rhel-7.2/100%/debian-x86_64-2018-04-03.cgz/lkp-knm02/stress-ng/1s

commit:
9627026352 ("mm: page_cache_add_speculative(): refactoring")
cdaa813278 ("mm/gup: track gup-pinned pages")

96270263521248d5 cdaa813278ddc616ee201eacda7
---------------- ---------------------------
%stddev %change %stddev
\ | \
66.75 ± 8% +260.3% 240.50 ± 3% stress-ng.numa.ops
46.23 ± 5% +401.7% 231.92 ± 3% stress-ng.numa.ops_per_sec
44912 ± 3% +6.8% 47988 stress-ng.time.voluntary_context_switches
34176 ± 8% -38.7% 20951 ± 11% numa-numastat.node1.numa_hit
34176 ± 8% -38.7% 20951 ± 11% numa-numastat.node1.other_node
162.25 ± 13% -97.5% 4.00 ±106% numa-vmstat.node0.nr_isolated_anon
180.50 ± 13% -100.0% 0.00 numa-vmstat.node1.nr_isolated_anon
22.10 ± 5% -21.7% 17.31 ± 4% perf-stat.i.MPKI
14.96 -4.5% 14.28 perf-stat.overall.MPKI
7.58 +0.1 7.65 perf-stat.overall.branch-miss-rate%
18.33 +0.4 18.73 perf-stat.overall.cache-miss-rate%
1831 +2.2% 1871 perf-stat.overall.cycles-between-cache-misses
1.46 ± 33% -1.1 0.32 ±103% perf-profile.calltrace.cycles-pp.serial8250_console_putchar.uart_console_write.serial8250_console_write.console_unlock.vprintk_emit
1.44 ± 35% -0.8 0.63 ± 78% perf-profile.calltrace.cycles-pp.wait_for_xmitr.serial8250_console_putchar.uart_console_write.serial8250_console_write.console_unlock
1.70 ± 15% -0.6 1.11 ± 62% perf-profile.calltrace.cycles-pp.run_rebalance_domains.__softirqentry_text_start.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt
0.55 ± 66% +2.6 3.12 ± 75% perf-profile.calltrace.cycles-pp.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.__ioctl.perf_evlist__enable
0.55 ± 66% +2.6 3.13 ± 74% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__ioctl.perf_evlist__enable.cmd_record.run_builtin
0.55 ± 66% +2.6 3.13 ± 74% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__ioctl.perf_evlist__enable.cmd_record
2626 ± 10% +17.0% 3072 ± 6% slabinfo.avtab_node.active_objs
2626 ± 10% +17.0% 3072 ± 6% slabinfo.avtab_node.num_objs
44473 +7.0% 47586 ± 4% slabinfo.filp.active_objs
44478 +7.1% 47622 ± 4% slabinfo.filp.num_objs
828.75 ± 9% -14.2% 711.00 ± 4% slabinfo.skbuff_fclone_cache.active_objs
828.75 ± 9% -14.2% 711.00 ± 4% slabinfo.skbuff_fclone_cache.num_objs
431.00 ± 30% -99.0% 4.50 ±101% proc-vmstat.nr_isolated_anon
34176 ± 8% -38.7% 20951 ± 11% proc-vmstat.numa_other
23394485 ± 9% -74.5% 5964109 ± 40% proc-vmstat.pgalloc_normal
23265642 ± 9% -75.0% 5825344 ± 41% proc-vmstat.pgfree
592.75 ± 15% +472.5% 3393 ± 8% proc-vmstat.pgmigrate_fail
51229 ± 8% -86.2% 7061 ± 15% proc-vmstat.pgmigrate_success
40681 ± 10% -83.7% 6633 ± 71% proc-vmstat.thp_deferred_split_page
497.44 ± 3% -16.4% 415.64 ± 4% sched_debug.cfs_rq:/.exec_clock.stddev
4.59 ± 20% -24.7% 3.45 ± 10% sched_debug.cfs_rq:/.load_avg.avg
321.12 ± 46% -63.6% 116.75 ± 20% sched_debug.cfs_rq:/.load_avg.max
23.22 ± 35% -51.5% 11.26 ± 12% sched_debug.cfs_rq:/.load_avg.stddev
14.50 ± 22% +62.9% 23.62 ± 23% sched_debug.cfs_rq:/.nr_spread_over.max
1.68 ± 12% +26.8% 2.13 ± 18% sched_debug.cfs_rq:/.nr_spread_over.stddev
1078 ± 5% +17.5% 1267 ± 9% sched_debug.cfs_rq:/.util_avg.max
9496 ± 18% -15.6% 8019 ± 4% softirqs.CPU0.RCU
40692 ± 5% -9.7% 36753 ± 2% softirqs.CPU0.TIMER
40960 ± 5% -9.7% 36968 ± 5% softirqs.CPU10.TIMER
42943 ± 5% -7.9% 39563 ± 5% softirqs.CPU114.TIMER
7967 ± 15% -14.7% 6795 ± 5% softirqs.CPU128.RCU
44489 ± 5% -13.2% 38635 ± 5% softirqs.CPU128.TIMER
43555 ± 7% -13.6% 37648 ± 5% softirqs.CPU144.TIMER
7955 ± 16% -15.9% 6693 ± 4% softirqs.CPU145.RCU
43111 ± 7% -11.9% 37982 ± 4% softirqs.CPU160.TIMER
7755 ± 18% -14.0% 6667 ± 6% softirqs.CPU186.RCU
43744 ± 6% -7.9% 40296 ± 5% softirqs.CPU187.TIMER
41152 ± 7% -9.2% 37362 ± 4% softirqs.CPU190.TIMER
43610 ± 3% -11.9% 38408 ± 5% softirqs.CPU198.TIMER
42378 ± 7% -11.4% 37538 ± 4% softirqs.CPU240.TIMER
8735 ± 18% -21.6% 6845 ± 8% softirqs.CPU262.RCU
37455 ± 4% -7.7% 34575 ± 3% softirqs.CPU270.TIMER
8857 ± 20% -25.2% 6622 ± 5% softirqs.CPU271.RCU
40008 ± 5% -9.7% 36122 ± 2% softirqs.CPU32.TIMER
45805 ± 22% -17.4% 37825 ± 4% softirqs.CPU37.TIMER
40227 ± 4% -10.2% 36125 ± 4% softirqs.CPU55.TIMER
44908 ± 6% -12.3% 39385 ± 5% softirqs.CPU64.TIMER
12303 ± 8% -24.0% 9348 ± 18% softirqs.CPU69.RCU
51650 ± 4% -17.8% 42473 ± 2% softirqs.CPU69.TIMER
44028 ± 6% -9.6% 39794 ± 3% softirqs.CPU86.TIMER
2009 ± 12% -24.9% 1509 ± 17% interrupts.CPU0.RES:Rescheduling_interrupts
956.00 ±136% -91.7% 79.00 ± 24% interrupts.CPU10.RES:Rescheduling_interrupts
58.50 ± 38% +127.4% 133.00 ± 42% interrupts.CPU101.RES:Rescheduling_interrupts
33.25 ± 29% +226.3% 108.50 ± 94% interrupts.CPU104.RES:Rescheduling_interrupts
7122 ± 3% +10.9% 7897 ± 2% interrupts.CPU115.CAL:Function_call_interrupts
56.25 ± 27% +98.2% 111.50 ± 62% interrupts.CPU118.RES:Rescheduling_interrupts
446.25 ± 38% -48.8% 228.50 ± 18% interrupts.CPU118.TLB:TLB_shootdowns
7016 ± 5% +11.8% 7841 interrupts.CPU124.CAL:Function_call_interrupts
241.00 ±120% -77.3% 54.75 ± 23% interrupts.CPU124.RES:Rescheduling_interrupts
103.00 ± 41% -54.6% 46.75 ± 28% interrupts.CPU125.RES:Rescheduling_interrupts
37.00 ± 8% +303.4% 149.25 ± 76% interrupts.CPU156.RES:Rescheduling_interrupts
66.25 ± 81% +108.3% 138.00 ± 64% interrupts.CPU162.RES:Rescheduling_interrupts
34.75 ± 25% +656.1% 262.75 ±116% interrupts.CPU168.RES:Rescheduling_interrupts
117.25 ± 20% -42.9% 67.00 ± 29% interrupts.CPU193.RES:Rescheduling_interrupts
8393 ± 3% +11.5% 9359 ± 4% interrupts.CPU205.CAL:Function_call_interrupts
87.75 ± 39% -44.7% 48.50 ± 28% interrupts.CPU205.RES:Rescheduling_interrupts
6863 ± 5% +12.5% 7720 ± 2% interrupts.CPU21.CAL:Function_call_interrupts
47.25 ± 9% +66.7% 78.75 ± 27% interrupts.CPU224.RES:Rescheduling_interrupts
47.50 ± 33% +81.1% 86.00 ± 29% interrupts.CPU232.RES:Rescheduling_interrupts
32.25 ± 14% +106.2% 66.50 ± 50% interrupts.CPU241.RES:Rescheduling_interrupts
328.25 ± 25% +167.6% 878.50 ± 60% interrupts.CPU248.TLB:TLB_shootdowns
36.75 ± 43% +105.4% 75.50 ± 40% interrupts.CPU256.RES:Rescheduling_interrupts
377.75 ± 20% +29.8% 490.25 ± 15% interrupts.CPU264.TLB:TLB_shootdowns
73.50 ± 45% +204.8% 224.00 ± 61% interrupts.CPU28.RES:Rescheduling_interrupts
79.25 ± 37% +76.3% 139.75 ± 35% interrupts.CPU32.RES:Rescheduling_interrupts
6892 ± 6% +14.9% 7919 ± 3% interrupts.CPU37.CAL:Function_call_interrupts
6892 ± 4% +16.9% 8055 ± 8% interrupts.CPU39.CAL:Function_call_interrupts
51.75 ± 15% +68.1% 87.00 ± 34% interrupts.CPU39.RES:Rescheduling_interrupts
476.00 ± 17% +102.9% 966.00 ± 31% interrupts.CPU40.TLB:TLB_shootdowns
6847 ± 5% +12.6% 7711 ± 2% interrupts.CPU41.CAL:Function_call_interrupts
6853 ± 4% +10.9% 7603 ± 3% interrupts.CPU48.CAL:Function_call_interrupts
522.50 ± 27% -48.1% 271.00 ± 74% interrupts.CPU54.TLB:TLB_shootdowns
454.25 ± 29% +85.5% 842.75 ± 49% interrupts.CPU60.TLB:TLB_shootdowns
300.00 ± 32% +148.1% 744.25 ± 55% interrupts.CPU61.TLB:TLB_shootdowns
52.75 ± 23% +130.3% 121.50 ± 50% interrupts.CPU64.RES:Rescheduling_interrupts
52.00 ± 22% +122.6% 115.75 ± 62% interrupts.CPU70.RES:Rescheduling_interrupts
6972 ± 5% +8.9% 7595 ± 5% interrupts.CPU77.CAL:Function_call_interrupts
6923 ± 4% +10.5% 7650 ± 2% interrupts.CPU8.CAL:Function_call_interrupts
7233 ± 5% +11.3% 8049 ± 2% interrupts.CPU81.CAL:Function_call_interrupts
585.25 ± 10% -37.6% 365.00 ± 27% interrupts.CPU88.TLB:TLB_shootdowns
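
For a rough manual reproduction of a stress-ng section like this one, the
parameter string in the header (class=cpu, nr_threads=100%, testtime=1s) maps
onto a plain stress-ng invocation. This is only a hedged approximation, since
the exact stressor arguments live in the lkp job file:

	# run each cpu-class stressor in turn, one worker per online CPU,
	# 1 second per stressor, and print a brief metrics summary
	stress-ng --class cpu --sequential 0 --timeout 1s --metrics-brief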



***************************************************************************************************
lkp-knm02: 272 threads Intel(R) Xeon Phi(TM) CPU 7255 @ 1.10GHz with 112G memory
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime:
pipe/gcc-7/performance/1HDD/x86_64-rhel-7.2/100%/debian-x86_64-2018-04-03.cgz/lkp-knm02/stress-ng/1s

commit:
9627026352 ("mm: page_cache_add_speculative(): refactoring")
cdaa813278 ("mm/gup: track gup-pinned pages")

96270263521248d5 cdaa813278ddc616ee201eacda7
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:4 25% 1:4 kmsg.DHCP/BOOTP:Reply_not_for_us_on_eth#,op[#]xid[#]
:4 25% 1:4 dmesg.RIP:get_gup_pin_page
1:4 -25% :4 dmesg.WARNING:at#for_ip_swapgs_restore_regs_and_return_to_usermode/0x
:4 25% 1:4 dmesg.WARNING:at_mm/gup.c:#get_gup_pin_page
1:4 -25% :4 dmesg.WARNING:stack_recursion
%stddev %change %stddev
\ | \
18012 -10.5% 16117 ± 8% stress-ng.time.percent_of_cpu_this_job_got
1592 -4.9% 1513 stress-ng.time.system_time
3995345 -47.5% 2099227 stress-ng.vm-splice.ops
4001986 -18.5% 3259638 ± 7% stress-ng.vm-splice.ops_per_sec
0.24 ±173% +1.3 1.53 ± 44% perf-profile.calltrace.cycles-pp.cpu_load_update.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer
185055 ± 59% +38.4% 256144 ± 49% meminfo.Active
184967 ± 59% +38.4% 256057 ± 49% meminfo.Active(anon)
185946 ± 58% +40.8% 261876 ± 50% numa-meminfo.node0.Active
185858 ± 58% +40.9% 261789 ± 50% numa-meminfo.node0.Active(anon)
17098 ± 6% -11.5% 15128 ± 3% numa-meminfo.node1.SUnreclaim
47648 ± 62% +42.2% 67747 ± 51% numa-vmstat.node0.nr_active_anon
47622 ± 62% +42.2% 67725 ± 51% numa-vmstat.node0.nr_zone_active_anon
4274 ± 6% -11.5% 3782 ± 3% numa-vmstat.node1.nr_slab_unreclaimable
24880 +0.9% 25102 proc-vmstat.nr_slab_reclaimable
18690590 +1.3% 18925901 proc-vmstat.numa_hit
18690590 +1.3% 18925901 proc-vmstat.numa_local
10842 ± 3% +7.3% 11637 ± 5% softirqs.CPU13.TIMER
14451 ± 26% -26.5% 10615 ± 5% softirqs.CPU18.TIMER
10649 ± 5% +15.8% 12333 ± 10% softirqs.CPU56.TIMER
3.37 ± 7% -44.3% 1.88 ± 33% sched_debug.cfs_rq:/.exec_clock.avg
22.72 ± 6% -29.9% 15.92 ± 26% sched_debug.cfs_rq:/.exec_clock.stddev
5.80 ± 34% -40.9% 3.43 ± 11% sched_debug.cfs_rq:/.load_avg.avg
1020 -51.2% 498.00 ± 6% sched_debug.cfs_rq:/.util_est_enqueued.max
78.11 ± 12% -33.3% 52.06 ± 21% sched_debug.cfs_rq:/.util_est_enqueued.stddev
7243 ± 12% +79.0% 12964 ± 17% sched_debug.cpu.avg_idle.min
2420 ± 18% -19.7% 1943 perf-stat.i.cycles-between-cache-misses
137.18 ± 5% +16.2% 159.47 ± 6% perf-stat.i.instructions-per-iTLB-miss
1969 ± 4% -7.2% 1827 ± 2% perf-stat.overall.cycles-between-cache-misses
0.90 ± 3% -0.0 0.85 perf-stat.overall.iTLB-load-miss-rate%
111.89 ± 2% +4.8% 117.28 perf-stat.overall.instructions-per-iTLB-miss
1.822e+11 -12.8% 1.589e+11 ± 11% perf-stat.ps.cpu-cycles
10509 ± 3% -12.6% 9183 ± 10% perf-stat.ps.minor-faults
457963 ± 2% -12.0% 402874 ± 11% perf-stat.ps.msec
10647 ± 4% -13.0% 9267 ± 10% perf-stat.ps.page-faults
1840 ± 34% -35.6% 1184 ± 19% interrupts.CPU12.RES:Rescheduling_interrupts
792.50 ± 7% +33.7% 1059 ± 17% interrupts.CPU126.RES:Rescheduling_interrupts
743.00 ± 50% +119.2% 1628 ± 20% interrupts.CPU154.RES:Rescheduling_interrupts
1490 ± 9% -37.5% 931.50 ± 21% interrupts.CPU16.RES:Rescheduling_interrupts
880.50 ± 30% +70.6% 1502 ± 10% interrupts.CPU194.RES:Rescheduling_interrupts
931.00 ± 23% +49.5% 1391 ± 18% interrupts.CPU216.RES:Rescheduling_interrupts
1158 ± 11% -31.0% 798.75 ± 30% interrupts.CPU236.RES:Rescheduling_interrupts
1253 ± 21% -32.7% 843.00 ± 31% interrupts.CPU241.RES:Rescheduling_interrupts
1700 ± 37% -58.2% 710.00 ± 48% interrupts.CPU243.RES:Rescheduling_interrupts
1399 ± 20% -40.4% 834.25 ± 35% interrupts.CPU255.RES:Rescheduling_interrupts
1064 ± 19% +32.0% 1405 ± 7% interrupts.CPU256.RES:Rescheduling_interrupts
996.75 ± 24% +32.0% 1316 ± 11% interrupts.CPU262.RES:Rescheduling_interrupts
1129 ± 22% +46.2% 1652 ± 16% interrupts.CPU267.RES:Rescheduling_interrupts
1553 ± 19% -34.4% 1019 ± 25% interrupts.CPU39.RES:Rescheduling_interrupts
1110 ± 5% +25.3% 1390 ± 14% interrupts.CPU40.RES:Rescheduling_interrupts
1073 ± 18% +45.5% 1562 ± 4% interrupts.CPU41.RES:Rescheduling_interrupts
1478 ± 20% -35.3% 956.50 ± 22% interrupts.CPU47.RES:Rescheduling_interrupts
1554 ± 7% -29.6% 1094 ± 24% interrupts.CPU54.RES:Rescheduling_interrupts
1613 ± 18% -37.6% 1007 ± 38% interrupts.CPU57.RES:Rescheduling_interrupts
946.25 ± 15% +45.7% 1379 ± 10% interrupts.CPU64.RES:Rescheduling_interrupts
780.00 ± 31% +81.5% 1415 ± 12% interrupts.CPU84.RES:Rescheduling_interrupts



***************************************************************************************************
lkp-bdw-ep3b: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.2/process/100%/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3b/futex2/will-it-scale/0xb00002e

commit:
9627026352 ("mm: page_cache_add_speculative(): refactoring")
cdaa813278 ("mm/gup: track gup-pinned pages")

96270263521248d5 cdaa813278ddc616ee201eacda7
---------------- ---------------------------
%stddev %change %stddev
\ | \
26674 ± 17% -22.3% 20716 ± 3% softirqs.CPU30.RCU
26755 ± 20% -21.3% 21064 ± 3% softirqs.CPU59.RCU
295164 +11.4% 328898 meminfo.Active
294996 +11.4% 328730 meminfo.Active(anon)
295368 -25.0% 221640 ± 14% meminfo.DirectMap4k
24.48 +0.2 24.66 perf-profile.calltrace.cycles-pp.get_futex_key.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
14.88 +0.2 15.09 perf-profile.calltrace.cycles-pp.gup_pgd_range.get_user_pages_fast.get_futex_key.futex_wait_setup.futex_wait
50.45 +0.2 50.67 perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
73752 +11.4% 82177 proc-vmstat.nr_active_anon
73752 +11.4% 82177 proc-vmstat.nr_zone_active_anon
710645 -2.3% 694065 proc-vmstat.pgfree
2.108e+08 ± 15% -53.8% 97334654 ± 18% cpuidle.C6.time
262362 ± 3% -54.2% 120108 ± 14% cpuidle.C6.usage
27692 ± 77% -79.8% 5606 ± 27% cpuidle.POLL.time
4498 ± 40% -44.2% 2509 ± 8% cpuidle.POLL.usage
362291 -41.5% 211769 ± 18% numa-numastat.node0.local_node
379384 -42.7% 217528 ± 19% numa-numastat.node0.numa_hit
17093 -66.3% 5760 ±120% numa-numastat.node0.other_node
289553 +50.8% 436751 ± 9% numa-numastat.node1.local_node
289591 +54.7% 448130 ± 9% numa-numastat.node1.numa_hit
250685 ± 3% -56.5% 108986 ± 17% turbostat.C6
0.77 ± 16% -0.4 0.34 ± 20% turbostat.C6%
0.55 ± 24% -58.3% 0.23 ± 32% turbostat.CPU%c6
0.35 ± 23% -54.2% 0.16 ± 55% turbostat.Pkg%pc2
0.03 ± 66% -91.7% 0.00 ±173% turbostat.Pkg%pc6
34.58 ± 11% +30.4% 45.08 ± 4% sched_debug.cfs_rq:/.nr_spread_over.max
3.94 ± 7% +25.1% 4.92 ± 5% sched_debug.cfs_rq:/.nr_spread_over.stddev
2.39 ± 2% +183.0% 6.78 ±106% sched_debug.cpu.cpu_load[0].stddev
24.58 +107.3% 50.96 ± 62% sched_debug.cpu.cpu_load[1].max
2.26 ± 6% +118.1% 4.94 ± 64% sched_debug.cpu.cpu_load[1].stddev
25212 ± 22% -25.8% 18696 ± 27% sched_debug.cpu.sched_count.max
125453 -33.3% 83670 ± 19% numa-meminfo.node0.AnonHugePages
168582 -21.2% 132815 ± 14% numa-meminfo.node0.AnonPages
124539 +18.2% 147258 ± 4% numa-meminfo.node1.Active
124539 +18.2% 147216 ± 4% numa-meminfo.node1.Active(anon)
43851 ± 4% +99.6% 87509 ± 18% numa-meminfo.node1.AnonHugePages
75101 ± 2% +48.9% 111789 ± 16% numa-meminfo.node1.AnonPages
2816 ± 2% +22.4% 3447 ± 17% numa-meminfo.node1.PageTables
54950 +11.2% 61121 ± 7% numa-meminfo.node1.SUnreclaim
42139 -21.2% 33199 ± 14% numa-vmstat.node0.nr_anon_pages
711043 -19.2% 574713 ± 14% numa-vmstat.node0.numa_hit
693926 -18.0% 568865 ± 13% numa-vmstat.node0.numa_local
17117 -65.8% 5847 ±117% numa-vmstat.node0.numa_other
31111 +18.3% 36809 ± 4% numa-vmstat.node1.nr_active_anon
18777 ± 2% +48.8% 27945 ± 16% numa-vmstat.node1.nr_anon_pages
699.50 ± 3% +22.0% 853.50 ± 16% numa-vmstat.node1.nr_page_table_pages
13737 +11.2% 15279 ± 7% numa-vmstat.node1.nr_slab_unreclaimable
31111 +18.3% 36809 ± 4% numa-vmstat.node1.nr_zone_active_anon
426549 +32.8% 566367 ± 15% numa-vmstat.node1.numa_hit
288820 +44.7% 417867 ± 20% numa-vmstat.node1.numa_local
0.00 -0.0 0.00 ± 2% perf-stat.i.dTLB-load-miss-rate%
195436 -4.1% 187470 ± 3% perf-stat.i.dTLB-load-misses
67.36 -0.7 66.69 perf-stat.i.iTLB-load-miss-rate%
1.66e+08 +2.8% 1.707e+08 perf-stat.i.iTLB-loads
51056 -8.1% 46912 ± 2% perf-stat.i.node-load-misses
7153 -30.0% 5006 ± 21% perf-stat.i.node-loads
0.00 -0.0 0.00 ± 3% perf-stat.overall.dTLB-load-miss-rate%
67.37 -0.7 66.69 perf-stat.overall.iTLB-load-miss-rate%
194906 -4.1% 186955 ± 3% perf-stat.ps.dTLB-load-misses
1.655e+08 +2.8% 1.701e+08 perf-stat.ps.iTLB-loads
50904 -8.1% 46767 ± 2% perf-stat.ps.node-load-misses
7136 -30.0% 4996 ± 21% perf-stat.ps.node-loads
352.50 ± 10% +32.5% 467.00 ± 14% interrupts.32:PCI-MSI.3145729-edge.eth0-TxRx-0
222.00 ± 16% +84.0% 408.50 ± 27% interrupts.34:PCI-MSI.3145731-edge.eth0-TxRx-2
275.00 ± 12% -25.7% 204.25 ± 17% interrupts.37:PCI-MSI.3145734-edge.eth0-TxRx-5
352.50 ± 10% +32.5% 467.00 ± 14% interrupts.CPU11.32:PCI-MSI.3145729-edge.eth0-TxRx-0
651.50 ± 59% -76.4% 154.00 ± 59% interrupts.CPU11.RES:Rescheduling_interrupts
222.00 ± 16% +84.0% 408.50 ± 27% interrupts.CPU13.34:PCI-MSI.3145731-edge.eth0-TxRx-2
1011 ± 2% -82.7% 175.25 ± 50% interrupts.CPU13.RES:Rescheduling_interrupts
1680 ± 26% -64.1% 602.75 ± 85% interrupts.CPU14.RES:Rescheduling_interrupts
472.50 ± 53% -54.1% 217.00 ± 95% interrupts.CPU15.RES:Rescheduling_interrupts
275.00 ± 12% -25.7% 204.25 ± 17% interrupts.CPU16.37:PCI-MSI.3145734-edge.eth0-TxRx-5
635.00 ± 62% -54.5% 289.00 ±113% interrupts.CPU17.RES:Rescheduling_interrupts
903.00 ± 12% -60.4% 357.50 ± 82% interrupts.CPU18.RES:Rescheduling_interrupts
109.00 ± 30% +627.1% 792.50 ± 99% interrupts.CPU24.RES:Rescheduling_interrupts
146.00 ± 8% +365.9% 680.25 ± 62% interrupts.CPU28.RES:Rescheduling_interrupts
1609 ± 53% -79.9% 323.50 ± 83% interrupts.CPU3.RES:Rescheduling_interrupts
125.50 ± 25% +1227.3% 1665 ± 59% interrupts.CPU30.RES:Rescheduling_interrupts
381.00 ± 48% +240.0% 1295 ± 33% interrupts.CPU31.RES:Rescheduling_interrupts
127.50 ± 29% +1944.3% 2606 ± 69% interrupts.CPU36.RES:Rescheduling_interrupts
132.00 ± 22% +141.3% 318.50 ± 23% interrupts.CPU40.RES:Rescheduling_interrupts
255.50 ± 69% +377.2% 1219 ± 71% interrupts.CPU42.RES:Rescheduling_interrupts
965.00 ± 90% -93.8% 60.00 ± 93% interrupts.CPU45.RES:Rescheduling_interrupts
945.50 ± 96% -97.6% 22.75 ± 54% interrupts.CPU49.RES:Rescheduling_interrupts
231.50 ± 85% -88.1% 27.50 ± 89% interrupts.CPU50.RES:Rescheduling_interrupts
345.50 ± 91% -95.0% 17.25 ± 70% interrupts.CPU57.RES:Rescheduling_interrupts
71.50 ± 44% -47.6% 37.50 ± 88% interrupts.CPU59.RES:Rescheduling_interrupts
7877 -37.5% 4926 ± 34% interrupts.CPU6.NMI:Non-maskable_interrupts
7877 -37.5% 4926 ± 34% interrupts.CPU6.PMI:Performance_monitoring_interrupts
1167 ± 44% -88.0% 139.75 ± 60% interrupts.CPU6.RES:Rescheduling_interrupts
331.50 ± 91% -94.7% 17.50 ± 62% interrupts.CPU61.RES:Rescheduling_interrupts
136.50 ± 4% -81.7% 25.00 ± 57% interrupts.CPU64.RES:Rescheduling_interrupts
7897 -37.8% 4913 ± 34% interrupts.CPU7.NMI:Non-maskable_interrupts
7897 -37.8% 4913 ± 34% interrupts.CPU7.PMI:Performance_monitoring_interrupts
1356 ± 18% -65.8% 463.75 ± 6% interrupts.CPU7.RES:Rescheduling_interrupts
27.50 ± 5% +335.5% 119.75 ±101% interrupts.CPU73.RES:Rescheduling_interrupts
203.00 ± 13% -80.3% 40.00 ± 31% interrupts.CPU75.RES:Rescheduling_interrupts
7885 -37.4% 4934 ± 34% interrupts.CPU8.NMI:Non-maskable_interrupts
7885 -37.4% 4934 ± 34% interrupts.CPU8.PMI:Performance_monitoring_interrupts
1232 ± 51% -80.8% 236.50 ± 85% interrupts.CPU8.RES:Rescheduling_interrupts
295.00 ± 61% -58.7% 121.75 ± 65% interrupts.CPU87.RES:Rescheduling_interrupts



***************************************************************************************************
lkp-knm02: 272 threads Intel(R) Xeon Phi(TM) CPU 7255 @ 1.10GHz with 112G memory
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime:
os/gcc-7/performance/1HDD/x86_64-rhel-7.2/100%/debian-x86_64-2018-04-03.cgz/lkp-knm02/stress-ng/1s

commit:
9627026352 ("mm: page_cache_add_speculative(): refactoring")
cdaa813278 ("mm/gup: track gup-pinned pages")

96270263521248d5 cdaa813278ddc616ee201eacda7
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
7:4 58% 10:4 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.key_lookup.lookup_user_key.keyctl_set_timeout
:4 50% 2:4 dmesg.RIP:get_gup_pin_page
:4 50% 2:4 dmesg.WARNING:at_mm/gup.c:#get_gup_pin_page
:4 100% 4:4 kmsg.Memory_failure:#:dirty_LRU_page_still_referenced_by#users
:4 100% 4:4 kmsg.Memory_failure:#:recovery_action_for_dirty_LRU_page:Failed
4:4 -100% :4 kmsg.Memory_failure:#:recovery_action_for_dirty_LRU_page:Recovered
%stddev %change %stddev
\ | \
291197 -55.0% 131136 ± 11% stress-ng.futex.ops
290079 -58.0% 121851 ± 12% stress-ng.futex.ops_per_sec
2015 ± 80% +203.6% 6119 ± 61% stress-ng.io.ops_per_sec
122468 ± 2% +22.4% 149941 ± 2% stress-ng.mlock.ops_per_sec
65.00 ± 6% +251.2% 228.25 ± 5% stress-ng.numa.ops
45.61 ± 5% +383.3% 220.41 ± 5% stress-ng.numa.ops_per_sec
625254 ± 6% -23.1% 480881 ± 10% stress-ng.sem.ops
624395 ± 6% -23.1% 480000 ± 10% stress-ng.sem.ops_per_sec
14903791 -2.6% 14517111 stress-ng.time.voluntary_context_switches
306260 ± 4% -8.3% 280934 ± 5% stress-ng.userfaultfd.ops
306179 ± 4% -8.2% 280922 ± 5% stress-ng.userfaultfd.ops_per_sec
4157856 -49.5% 2098955 stress-ng.vm-splice.ops
4161453 -13.4% 3603810 stress-ng.vm-splice.ops_per_sec
1.53e+08 ± 2% +3.1% 1.578e+08 perf-stat.ps.iTLB-load-misses
4293169 ± 66% -44.9% 2363760 ± 2% cpuidle.POLL.time
81481 ± 48% -34.3% 53539 ± 2% cpuidle.POLL.usage
33283 ± 6% -59.2% 13575 ± 11% numa-numastat.node1.numa_hit
33283 ± 6% -59.2% 13575 ± 11% numa-numastat.node1.other_node
564127 ± 4% +185.9% 1612672 ± 19% meminfo.Active
563846 ± 4% +185.9% 1612232 ± 19% meminfo.Active(anon)
485.00 ± 22% +118.9% 1061 ± 7% meminfo.Inactive(file)
4144644 +26.3% 5236228 ± 6% meminfo.Memused
510156 ± 3% +205.4% 1558069 ± 20% numa-meminfo.node0.Active
509874 ± 3% +205.5% 1557628 ± 20% numa-meminfo.node0.Active(anon)
474.50 ± 21% +123.8% 1061 ± 9% numa-meminfo.node0.Inactive(file)
3534355 +30.6% 4617529 ± 7% numa-meminfo.node0.MemUsed
123321 ± 4% +218.3% 392482 ± 20% numa-vmstat.node0.nr_active_anon
70.00 ± 19% +58.9% 111.25 ± 3% numa-vmstat.node0.nr_active_file
119.00 ± 21% +124.2% 266.75 ± 8% numa-vmstat.node0.nr_inactive_file
123242 ± 4% +218.4% 392451 ± 20% numa-vmstat.node0.nr_zone_active_anon
70.00 ± 19% +58.9% 111.25 ± 3% numa-vmstat.node0.nr_zone_active_file
119.00 ± 21% +124.2% 266.75 ± 8% numa-vmstat.node0.nr_zone_inactive_file
3053 ± 6% -14.4% 2615 ± 12% sched_debug.cfs_rq:/.exec_clock.stddev
1.46 ± 6% -15.1% 1.24 ± 5% sched_debug.cfs_rq:/.nr_running.max
585.54 ± 15% -35.4% 378.27 ± 19% sched_debug.cfs_rq:/.runnable_load_avg.max
567.38 ± 15% -37.7% 353.29 ± 11% sched_debug.cpu.cpu_load[0].max
93234 ± 21% -24.0% 70869 ± 10% sched_debug.cpu.curr->pid.avg
185032 ± 4% -13.7% 159654 ± 10% sched_debug.cpu.curr->pid.max
23033 ± 9% -9.2% 20906 ± 6% sched_debug.cpu.sched_goidle.max
2420 ± 6% -13.9% 2085 ± 4% sched_debug.cpu.sched_goidle.stddev
38.68 ± 40% -60.9% 15.14 ± 30% sched_debug.rt_rq:/.rt_time.max
7.03 ± 67% -61.5% 2.71 ± 40% sched_debug.rt_rq:/.rt_time.stddev
134783 ± 5% +190.2% 391176 ± 17% proc-vmstat.nr_active_anon
70.25 ± 17% +58.4% 111.25 ± 4% proc-vmstat.nr_active_file
121.75 ± 20% +117.9% 265.25 ± 7% proc-vmstat.nr_inactive_file
131.00 ± 23% -99.8% 0.25 ±173% proc-vmstat.nr_isolated_anon
105352 -2.8% 102442 proc-vmstat.nr_slab_reclaimable
134783 ± 5% +190.2% 391176 ± 17% proc-vmstat.nr_zone_active_anon
70.25 ± 17% +58.4% 111.25 ± 4% proc-vmstat.nr_zone_active_file
121.75 ± 20% +117.9% 265.25 ± 7% proc-vmstat.nr_zone_inactive_file
864.50 ± 25% +638.5% 6384 ± 83% proc-vmstat.numa_huge_pte_updates
33283 ± 6% -59.2% 13575 ± 11% proc-vmstat.numa_other
445254 ± 25% +634.9% 3272027 ± 83% proc-vmstat.numa_pte_updates
466.00 ± 12% +382.9% 2250 ± 9% proc-vmstat.pgmigrate_fail
49887 ± 6% -92.6% 3698 ± 33% proc-vmstat.pgmigrate_success
7822768 -1.3% 7721955 proc-vmstat.unevictable_pgs_culled
352423 ± 2% +4.2% 367091 proc-vmstat.unevictable_pgs_mlocked
351801 ± 2% +4.1% 366356 proc-vmstat.unevictable_pgs_munlocked
351393 ± 2% +4.1% 365940 proc-vmstat.unevictable_pgs_rescued
11075 ± 8% -16.5% 9245 ± 9% softirqs.CPU100.NET_RX
8323 ± 9% +47.3% 12259 ± 23% softirqs.CPU111.NET_RX
10330 ± 11% -17.8% 8494 ± 5% softirqs.CPU120.NET_RX
28219 ± 4% -13.9% 24292 ± 3% softirqs.CPU123.RCU
19316 ± 6% -8.9% 17600 ± 4% softirqs.CPU127.SCHED
11518 ± 2% -26.4% 8477 ± 9% softirqs.CPU128.NET_RX
6724 ± 22% +67.4% 11258 ± 24% softirqs.CPU131.NET_RX
8571 ± 33% +62.2% 13903 ± 24% softirqs.CPU138.NET_RX
7441 ± 4% +33.9% 9965 ± 14% softirqs.CPU141.NET_RX
8270 ± 16% +44.7% 11966 ± 15% softirqs.CPU146.NET_RX
11298 ± 20% -36.7% 7155 ± 18% softirqs.CPU148.NET_RX
8935 ± 23% +33.2% 11900 ± 29% softirqs.CPU153.NET_RX
19787 ± 5% -9.6% 17897 ± 3% softirqs.CPU163.SCHED
26941 ± 5% -9.5% 24385 ± 3% softirqs.CPU169.RCU
12370 ± 25% -34.7% 8074 ± 13% softirqs.CPU177.NET_RX
20748 ± 6% -10.9% 18487 ± 4% softirqs.CPU181.SCHED
20707 ± 9% -16.1% 17369 ± 2% softirqs.CPU186.SCHED
19622 ± 6% -10.7% 17522 ± 3% softirqs.CPU187.SCHED
19406 ± 6% -8.9% 17672 ± 2% softirqs.CPU194.SCHED
32272 ± 8% -14.2% 27679 ± 7% softirqs.CPU20.RCU
12603 ± 21% -29.4% 8893 ± 16% softirqs.CPU209.NET_RX
11432 ± 9% -14.7% 9749 ± 14% softirqs.CPU21.NET_RX
9075 ± 9% -19.4% 7310 ± 7% softirqs.CPU214.NET_RX
8118 ± 9% +55.0% 12584 ± 24% softirqs.CPU240.NET_RX
9232 ± 16% -27.3% 6708 ± 9% softirqs.CPU244.NET_RX
8203 ± 15% +32.8% 10895 ± 7% softirqs.CPU265.NET_RX
14174 ± 8% -31.4% 9727 ± 23% softirqs.CPU267.NET_RX
17401 ± 11% -55.9% 7675 ± 21% softirqs.CPU28.NET_RX
30370 ± 4% -12.7% 26507 ± 3% softirqs.CPU3.RCU
11791 ± 4% -34.4% 7732 ± 13% softirqs.CPU46.NET_RX
19641 ± 7% -10.5% 17574 ± 4% softirqs.CPU51.SCHED
9879 ± 18% +21.5% 12003 ± 13% softirqs.CPU53.NET_RX
9572 ± 7% -8.3% 8781 ± 6% softirqs.CPU58.NET_RX
11786 ± 12% -33.7% 7818 ± 4% softirqs.CPU6.NET_RX
19919 ± 8% -12.9% 17341 ± 4% softirqs.CPU69.SCHED
8365 ± 15% +64.2% 13734 ± 35% softirqs.CPU76.NET_RX
7648 ± 11% +38.0% 10557 ± 17% softirqs.CPU77.NET_RX
9286 ± 15% +44.2% 13394 ± 16% softirqs.CPU88.NET_RX
27697 ± 4% -11.6% 24482 ± 3% softirqs.CPU90.RCU
8763 ± 16% +34.9% 11818 ± 22% softirqs.CPU94.NET_RX
2680 ± 15% -51.8% 1293 ± 28% interrupts.CPU100.NMI:Non-maskable_interrupts
2680 ± 15% -51.8% 1293 ± 28% interrupts.CPU100.PMI:Performance_monitoring_interrupts
2219 ± 20% -30.4% 1545 ± 22% interrupts.CPU104.NMI:Non-maskable_interrupts
2219 ± 20% -30.4% 1545 ± 22% interrupts.CPU104.PMI:Performance_monitoring_interrupts
2314 ± 19% -31.7% 1580 ± 10% interrupts.CPU106.TLB:TLB_shootdowns
2699 ± 18% -30.8% 1868 ± 10% interrupts.CPU108.TLB:TLB_shootdowns
17763 ± 25% +63.0% 28961 ± 13% interrupts.CPU113.RES:Rescheduling_interrupts
16871 ± 22% +88.1% 31734 ± 27% interrupts.CPU125.RES:Rescheduling_interrupts
24894 ± 18% +27.6% 31769 ± 8% interrupts.CPU128.RES:Rescheduling_interrupts
1755 ± 23% +77.4% 3112 ± 57% interrupts.CPU130.TLB:TLB_shootdowns
2644 ± 31% -35.0% 1719 ± 20% interrupts.CPU142.TLB:TLB_shootdowns
1164 ± 14% +64.3% 1912 ± 46% interrupts.CPU145.NMI:Non-maskable_interrupts
1164 ± 14% +64.3% 1912 ± 46% interrupts.CPU145.PMI:Performance_monitoring_interrupts
23656 ± 38% +53.7% 36365 ± 30% interrupts.CPU150.RES:Rescheduling_interrupts
27724 ± 6% -30.2% 19354 ± 16% interrupts.CPU151.RES:Rescheduling_interrupts
2491 ± 23% -29.6% 1754 ± 18% interrupts.CPU155.TLB:TLB_shootdowns
2153 ± 17% +16.1% 2499 ± 16% interrupts.CPU167.TLB:TLB_shootdowns
2170 ± 18% -20.0% 1735 ± 11% interrupts.CPU169.TLB:TLB_shootdowns
2004 ± 30% -33.3% 1337 ± 17% interrupts.CPU170.NMI:Non-maskable_interrupts
2004 ± 30% -33.3% 1337 ± 17% interrupts.CPU170.PMI:Performance_monitoring_interrupts
2079 ± 20% -25.1% 1557 ± 8% interrupts.CPU171.TLB:TLB_shootdowns
1754 ± 10% +44.1% 2527 ± 17% interrupts.CPU187.TLB:TLB_shootdowns
1632 ± 12% +45.6% 2376 ± 21% interrupts.CPU191.TLB:TLB_shootdowns
3271 ± 49% -50.5% 1618 ± 49% interrupts.CPU199.NMI:Non-maskable_interrupts
3271 ± 49% -50.5% 1618 ± 49% interrupts.CPU199.PMI:Performance_monitoring_interrupts
3481 ± 41% -52.1% 1668 ± 33% interrupts.CPU200.NMI:Non-maskable_interrupts
3481 ± 41% -52.1% 1668 ± 33% interrupts.CPU200.PMI:Performance_monitoring_interrupts
20086 ± 22% +84.1% 36982 ± 5% interrupts.CPU212.RES:Rescheduling_interrupts
1738 ± 12% +17.8% 2048 ± 5% interrupts.CPU212.TLB:TLB_shootdowns
18973 ± 23% +51.8% 28802 ± 19% interrupts.CPU216.RES:Rescheduling_interrupts
2301 ± 28% -51.6% 1113 ± 41% interrupts.CPU218.NMI:Non-maskable_interrupts
2301 ± 28% -51.6% 1113 ± 41% interrupts.CPU218.PMI:Performance_monitoring_interrupts
2267 ± 10% -16.8% 1887 ± 15% interrupts.CPU220.TLB:TLB_shootdowns
15399 ± 14% +61.3% 24836 ± 7% interrupts.CPU222.RES:Rescheduling_interrupts
36076 ± 16% -43.8% 20263 ± 36% interrupts.CPU226.RES:Rescheduling_interrupts
19194 ± 10% +84.8% 35478 ± 13% interrupts.CPU227.RES:Rescheduling_interrupts
2425 ± 8% -19.9% 1942 ± 6% interrupts.CPU227.TLB:TLB_shootdowns
2303 ± 22% -23.9% 1753 ± 8% interrupts.CPU234.TLB:TLB_shootdowns
2837 ± 8% -49.9% 1421 ± 11% interrupts.CPU239.NMI:Non-maskable_interrupts
2837 ± 8% -49.9% 1421 ± 11% interrupts.CPU239.PMI:Performance_monitoring_interrupts
17386 ± 17% +84.6% 32093 ± 30% interrupts.CPU240.RES:Rescheduling_interrupts
18801 ± 55% +92.8% 36248 ± 23% interrupts.CPU246.RES:Rescheduling_interrupts
2469 ± 31% -29.8% 1732 ± 17% interrupts.CPU247.TLB:TLB_shootdowns
31757 ± 27% -35.3% 20541 ± 31% interrupts.CPU250.RES:Rescheduling_interrupts
36905 ± 40% -46.7% 19670 ± 8% interrupts.CPU252.RES:Rescheduling_interrupts
20156 ± 12% +44.7% 29156 ± 7% interrupts.CPU253.RES:Rescheduling_interrupts
4880 ± 30% -48.0% 2539 ± 10% interrupts.CPU26.TLB:TLB_shootdowns
30073 ± 26% -37.2% 18889 ± 28% interrupts.CPU260.RES:Rescheduling_interrupts
24487 ± 4% -23.7% 18693 ± 23% interrupts.CPU261.RES:Rescheduling_interrupts
2201 ± 26% -42.9% 1257 ± 29% interrupts.CPU262.NMI:Non-maskable_interrupts
2201 ± 26% -42.9% 1257 ± 29% interrupts.CPU262.PMI:Performance_monitoring_interrupts
16758 ± 30% +76.3% 29544 ± 16% interrupts.CPU263.RES:Rescheduling_interrupts
2958 ± 56% -54.4% 1350 ± 25% interrupts.CPU266.NMI:Non-maskable_interrupts
2958 ± 56% -54.4% 1350 ± 25% interrupts.CPU266.PMI:Performance_monitoring_interrupts
29594 ± 23% -35.9% 18979 ± 33% interrupts.CPU268.RES:Rescheduling_interrupts
2391 ± 20% -42.8% 1367 ± 25% interrupts.CPU27.NMI:Non-maskable_interrupts
2391 ± 20% -42.8% 1367 ± 25% interrupts.CPU27.PMI:Performance_monitoring_interrupts
17135 ± 26% +74.3% 29873 ± 20% interrupts.CPU32.RES:Rescheduling_interrupts
32963 ± 29% -31.4% 22602 ± 27% interrupts.CPU36.RES:Rescheduling_interrupts
2328 ± 28% -34.4% 1527 ± 50% interrupts.CPU39.NMI:Non-maskable_interrupts
2328 ± 28% -34.4% 1527 ± 50% interrupts.CPU39.PMI:Performance_monitoring_interrupts
16869 ± 24% +69.0% 28507 ± 22% interrupts.CPU39.RES:Rescheduling_interrupts
2584 ± 21% -26.7% 1895 ± 16% interrupts.CPU51.TLB:TLB_shootdowns
3155 ± 16% -36.1% 2015 ± 20% interrupts.CPU52.TLB:TLB_shootdowns
4951 ± 28% -60.4% 1961 ± 25% interrupts.CPU55.TLB:TLB_shootdowns
4382 ± 24% -50.2% 2183 ± 14% interrupts.CPU56.TLB:TLB_shootdowns
15835 ± 21% +108.6% 33033 ± 30% interrupts.CPU6.RES:Rescheduling_interrupts
3549 ± 29% -34.9% 2310 ± 16% interrupts.CPU61.TLB:TLB_shootdowns
3466 ± 41% -51.2% 1691 ± 58% interrupts.CPU67.NMI:Non-maskable_interrupts
3466 ± 41% -51.2% 1691 ± 58% interrupts.CPU67.PMI:Performance_monitoring_interrupts
35989 ± 19% -36.0% 23024 ± 27% interrupts.CPU75.RES:Rescheduling_interrupts
2210 ± 6% -30.4% 1538 ± 14% interrupts.CPU76.TLB:TLB_shootdowns
3307 ± 45% -62.2% 1250 ± 14% interrupts.CPU78.NMI:Non-maskable_interrupts
3307 ± 45% -62.2% 1250 ± 14% interrupts.CPU78.PMI:Performance_monitoring_interrupts
1604 +54.6% 2480 ± 12% interrupts.CPU78.TLB:TLB_shootdowns
2863 ± 57% -42.8% 1636 ± 11% interrupts.CPU90.NMI:Non-maskable_interrupts
2863 ± 57% -42.8% 1636 ± 11% interrupts.CPU90.PMI:Performance_monitoring_interrupts
2308 ± 11% -21.7% 1808 ± 8% interrupts.CPU91.TLB:TLB_shootdowns
18040 ± 36% +89.6% 34197 ± 9% interrupts.CPU94.RES:Rescheduling_interrupts
21085 ± 18% +51.0% 31835 ± 16% interrupts.CPU97.RES:Rescheduling_interrupts



***************************************************************************************************
lkp-bdw-ep3: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
vm/gcc-7/performance/1HDD/x86_64-rhel-7.2/100%/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3/stress-ng/1s/0xb00002e

commit:
9627026352 ("mm: page_cache_add_speculative(): refactoring")
cdaa813278 ("mm/gup: track gup-pinned pages")

96270263521248d5 cdaa813278ddc616ee201eacda7
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:4 100% 4:4 kmsg.Memory_failure:#:dirty_LRU_page_still_referenced_by#users
:4 100% 4:4 kmsg.Memory_failure:#:recovery_action_for_dirty_LRU_page:Failed
4:4 -100% :4 kmsg.Memory_failure:#:recovery_action_for_dirty_LRU_page:Recovered
:4 100% 4:4 dmesg.RIP:get_gup_pin_page
1:4 -25% :4 dmesg.WARNING:at#for_ip_swapgs_restore_regs_and_return_to_usermode/0x
:4 100% 4:4 dmesg.WARNING:at_mm/gup.c:#get_gup_pin_page
1:4 -25% :4 dmesg.WARNING:stack_recursion
%stddev %change %stddev
\ | \
7765 -3.4% 7499 stress-ng.time.percent_of_cpu_this_job_got
1880 -5.1% 1784 stress-ng.time.system_time
5023675 ± 2% -58.3% 2095111 stress-ng.vm-splice.ops
17.91 ± 8% +12.9% 20.22 ± 6% iostat.cpu.idle
2406 -3.4% 2325 turbostat.Avg_MHz
59555797 -11.7% 52587698 vmstat.memory.free
15.47 ± 9% +2.3 17.80 ± 7% mpstat.cpu.idle%
0.10 ± 8% -0.0 0.08 ± 8% mpstat.cpu.soft%
34300 ± 2% -5.0% 32577 perf-stat.ps.major-faults
170288 -0.9% 168728 perf-stat.ps.msec
1131 ± 7% -17.7% 930.75 ± 14% slabinfo.nsproxy.active_objs
1131 ± 7% -17.7% 930.75 ± 14% slabinfo.nsproxy.num_objs
4662 ± 9% +69.2% 7887 ± 49% softirqs.CPU22.RCU
3963 ± 8% +151.7% 9974 ± 50% softirqs.CPU78.RCU
2905249 ± 10% +192.0% 8483775 ± 3% meminfo.Active
2905081 ± 10% +192.0% 8483607 ± 3% meminfo.Active(anon)
23150145 ± 4% -9.9% 20851826 ± 3% meminfo.Committed_AS
1365340 ± 5% +84.5% 2518784 ± 4% meminfo.Inactive
1364952 ± 5% +84.1% 2512920 ± 4% meminfo.Inactive(anon)
387.25 ± 6% +1414.2% 5863 ± 4% meminfo.Inactive(file)
59176014 -11.4% 52417164 meminfo.MemAvailable
59432778 -11.4% 52671931 meminfo.MemFree
6419352 ± 3% +105.3% 13180195 ± 2% meminfo.Memused
1.88 ± 6% +110.3% 3.96 ± 39% sched_debug.cpu.cpu_load[1].avg
19.25 ± 2% +629.9% 140.50 ± 62% sched_debug.cpu.cpu_load[1].max
4.41 ± 4% +258.0% 15.78 ± 53% sched_debug.cpu.cpu_load[1].stddev
2.31 ± 14% +94.2% 4.48 ± 31% sched_debug.cpu.cpu_load[2].avg
26.25 ± 47% +471.4% 150.00 ± 41% sched_debug.cpu.cpu_load[2].max
4.71 ± 21% +248.5% 16.42 ± 38% sched_debug.cpu.cpu_load[2].stddev
37.25 ± 37% +208.1% 114.75 ± 37% sched_debug.cpu.cpu_load[3].max
5.87 ± 19% +125.8% 13.25 ± 34% sched_debug.cpu.cpu_load[3].stddev
33.00 ± 19% +122.0% 73.25 ± 46% sched_debug.cpu.cpu_load[4].max
11.46 ± 47% -7.5 3.97 ± 98% perf-profile.calltrace.cycles-pp.static_key_slow_dec.sw_perf_event_destroy._free_event.perf_event_release_kernel.perf_release
11.46 ± 47% -7.5 3.97 ± 98% perf-profile.calltrace.cycles-pp.__static_key_slow_dec_cpuslocked.static_key_slow_dec.sw_perf_event_destroy._free_event.perf_event_release_kernel
11.46 ± 47% -7.5 3.97 ± 98% perf-profile.calltrace.cycles-pp.__jump_label_update.__static_key_slow_dec_cpuslocked.static_key_slow_dec.sw_perf_event_destroy._free_event
11.46 ± 47% -7.5 3.97 ± 98% perf-profile.calltrace.cycles-pp.arch_jump_label_transform.__jump_label_update.__static_key_slow_dec_cpuslocked.static_key_slow_dec.sw_perf_event_destroy
11.46 ± 47% -7.5 3.97 ± 98% perf-profile.calltrace.cycles-pp.__jump_label_transform.arch_jump_label_transform.__jump_label_update.__static_key_slow_dec_cpuslocked.static_key_slow_dec
11.46 ± 47% -7.5 3.97 ± 98% perf-profile.calltrace.cycles-pp.smp_call_function_many.on_each_cpu.text_poke_bp.__jump_label_transform.arch_jump_label_transform
11.46 ± 47% -7.5 3.97 ± 98% perf-profile.calltrace.cycles-pp.on_each_cpu.text_poke_bp.__jump_label_transform.arch_jump_label_transform.__jump_label_update
11.46 ± 47% -7.5 3.97 ± 98% perf-profile.calltrace.cycles-pp.text_poke_bp.__jump_label_transform.arch_jump_label_transform.__jump_label_update.__static_key_slow_dec_cpuslocked
11.46 ± 47% -6.7 4.78 ± 69% perf-profile.calltrace.cycles-pp._free_event.perf_event_release_kernel.perf_release.__fput.task_work_run
11.46 ± 47% -6.7 4.78 ± 69% perf-profile.calltrace.cycles-pp.sw_perf_event_destroy._free_event.perf_event_release_kernel.perf_release.__fput
695203 ± 5% +213.3% 2177910 ± 3% proc-vmstat.nr_active_anon
1473798 -11.8% 1299346 proc-vmstat.nr_dirty_background_threshold
2952003 -11.8% 2603072 proc-vmstat.nr_dirty_threshold
14852961 -11.8% 13107633 proc-vmstat.nr_free_pages
377237 ± 7% +69.4% 639157 ± 3% proc-vmstat.nr_inactive_anon
97.00 ± 3% +1395.6% 1450 ± 6% proc-vmstat.nr_inactive_file
695202 ± 5% +213.3% 2177910 ± 3% proc-vmstat.nr_zone_active_anon
377237 ± 7% +69.4% 639157 ± 3% proc-vmstat.nr_zone_inactive_anon
97.00 ± 3% +1395.6% 1450 ± 6% proc-vmstat.nr_zone_inactive_file
12792774 ± 2% -10.7% 11419593 ± 7% proc-vmstat.pgactivate
2113 -3.3% 2043 ± 2% proc-vmstat.thp_split_page
1404817 ± 4% +213.6% 4405232 ± 9% numa-meminfo.node0.Active
1404733 ± 4% +213.6% 4405148 ± 9% numa-meminfo.node0.Active(anon)
828406 ± 12% +51.0% 1250665 ± 10% numa-meminfo.node0.Inactive
828190 ± 12% +50.7% 1247741 ± 10% numa-meminfo.node0.Inactive(anon)
214.75 ± 59% +1261.6% 2924 ± 10% numa-meminfo.node0.Inactive(file)
29574615 -11.6% 26131731 ± 2% numa-meminfo.node0.MemFree
3284371 ± 6% +104.8% 6727255 ± 8% numa-meminfo.node0.MemUsed
1476404 ± 11% +204.2% 4491068 ± 2% numa-meminfo.node1.Active
1476320 ± 11% +204.2% 4490984 ± 2% numa-meminfo.node1.Active(anon)
684877 ± 19% +87.8% 1286317 ± 9% numa-meminfo.node1.Inactive
684690 ± 19% +87.5% 1283519 ± 9% numa-meminfo.node1.Inactive(anon)
186.50 ± 65% +1400.3% 2798 ± 6% numa-meminfo.node1.Inactive(file)
29743730 -12.1% 26154353 numa-meminfo.node1.MemFree
3249412 ± 9% +110.5% 6838786 ± 3% numa-meminfo.node1.MemUsed
377498 ± 12% +193.5% 1108039 ± 9% numa-vmstat.node0.nr_active_anon
7374428 -11.6% 6515488 ± 2% numa-vmstat.node0.nr_free_pages
203915 ± 20% +57.8% 321729 ± 8% numa-vmstat.node0.nr_inactive_anon
55.25 ± 59% +1211.3% 724.50 ± 11% numa-vmstat.node0.nr_inactive_file
377455 ± 12% +193.5% 1108012 ± 9% numa-vmstat.node0.nr_zone_active_anon
203887 ± 20% +57.8% 321697 ± 8% numa-vmstat.node0.nr_zone_inactive_anon
55.25 ± 59% +1211.3% 724.50 ± 11% numa-vmstat.node0.nr_zone_inactive_file
392594 ± 12% +181.8% 1106200 ± 3% numa-vmstat.node1.nr_active_anon
7400171 -11.6% 6543657 numa-vmstat.node1.nr_free_pages
185991 ± 20% +78.6% 332154 ± 8% numa-vmstat.node1.nr_inactive_anon
44.75 ± 65% +1458.7% 697.50 ± 5% numa-vmstat.node1.nr_inactive_file
392558 ± 12% +181.8% 1106188 ± 3% numa-vmstat.node1.nr_zone_active_anon
185980 ± 20% +78.6% 332130 ± 8% numa-vmstat.node1.nr_zone_inactive_anon
44.75 ± 65% +1458.7% 697.50 ± 5% numa-vmstat.node1.nr_zone_inactive_file
92917 -2.7% 90421 interrupts.CAL:Function_call_interrupts
537.00 ± 10% +28.6% 690.50 ± 12% interrupts.CPU13.RES:Rescheduling_interrupts
574.25 ± 6% +44.4% 829.50 ± 29% interrupts.CPU17.RES:Rescheduling_interrupts
4162 ± 12% -22.4% 3231 ± 5% interrupts.CPU22.TLB:TLB_shootdowns
4051 ± 11% -20.9% 3203 ± 3% interrupts.CPU23.TLB:TLB_shootdowns
4017 ± 5% -17.4% 3318 ± 3% interrupts.CPU24.TLB:TLB_shootdowns
4249 ± 10% -22.7% 3283 ± 8% interrupts.CPU25.TLB:TLB_shootdowns
4003 ± 7% -17.8% 3291 ± 2% interrupts.CPU26.TLB:TLB_shootdowns
4161 ± 8% -19.8% 3336 ± 7% interrupts.CPU27.TLB:TLB_shootdowns
4063 ± 5% -17.9% 3337 ± 5% interrupts.CPU28.TLB:TLB_shootdowns
3970 ± 6% -17.6% 3271 ± 4% interrupts.CPU29.TLB:TLB_shootdowns
4093 ± 7% -20.1% 3270 ± 3% interrupts.CPU30.TLB:TLB_shootdowns
4001 ± 7% -15.2% 3392 ± 9% interrupts.CPU31.TLB:TLB_shootdowns
4246 ± 7% -22.6% 3284 ± 3% interrupts.CPU32.TLB:TLB_shootdowns
3949 ± 6% -14.9% 3361 ± 6% interrupts.CPU33.TLB:TLB_shootdowns
3949 ± 5% -15.6% 3332 ± 8% interrupts.CPU34.TLB:TLB_shootdowns
4057 ± 9% -20.4% 3228 ± 3% interrupts.CPU35.TLB:TLB_shootdowns
4105 ± 7% -22.5% 3180 ± 8% interrupts.CPU36.TLB:TLB_shootdowns
3957 ± 7% -18.7% 3217 ± 7% interrupts.CPU37.TLB:TLB_shootdowns
4071 ± 8% -20.7% 3229 ± 4% interrupts.CPU38.TLB:TLB_shootdowns
4256 ± 18% -22.8% 3286 ± 4% interrupts.CPU39.TLB:TLB_shootdowns
4334 ± 2% -22.0% 3381 ± 6% interrupts.CPU40.TLB:TLB_shootdowns
3858 ± 5% -17.2% 3195 ± 7% interrupts.CPU42.TLB:TLB_shootdowns
4212 ± 9% -20.3% 3356 ± 3% interrupts.CPU43.TLB:TLB_shootdowns
3922 ± 9% -17.7% 3227 ± 4% interrupts.CPU66.TLB:TLB_shootdowns
4021 ± 3% -16.1% 3372 ± 5% interrupts.CPU67.TLB:TLB_shootdowns
4176 ± 4% -19.0% 3381 ± 4% interrupts.CPU68.TLB:TLB_shootdowns
4149 ± 5% -39.5% 2508 ± 57% interrupts.CPU69.TLB:TLB_shootdowns
3993 ± 8% -18.1% 3270 ± 3% interrupts.CPU70.TLB:TLB_shootdowns
4059 ± 4% -17.4% 3352 ± 4% interrupts.CPU71.TLB:TLB_shootdowns
514.50 ± 5% +49.7% 770.00 ± 24% interrupts.CPU72.RES:Rescheduling_interrupts
3994 ± 5% -20.9% 3158 ± 2% interrupts.CPU72.TLB:TLB_shootdowns
4034 ± 7% -17.1% 3342 ± 3% interrupts.CPU73.TLB:TLB_shootdowns
4018 ± 7% -15.1% 3412 ± 4% interrupts.CPU74.TLB:TLB_shootdowns
4052 ± 7% -14.6% 3459 ± 5% interrupts.CPU75.TLB:TLB_shootdowns
4060 ± 6% -19.6% 3263 ± 5% interrupts.CPU76.TLB:TLB_shootdowns
4025 ± 5% -20.1% 3217 ± 2% interrupts.CPU77.TLB:TLB_shootdowns
4071 ± 6% -21.5% 3197 ± 2% interrupts.CPU78.TLB:TLB_shootdowns
4045 ± 5% -17.7% 3330 ± 7% interrupts.CPU79.TLB:TLB_shootdowns
454.00 ± 36% +51.5% 688.00 ± 13% interrupts.CPU80.RES:Rescheduling_interrupts
3963 ± 5% -18.5% 3229 ± 5% interrupts.CPU81.TLB:TLB_shootdowns
4202 ± 5% -25.3% 3139 ± 5% interrupts.CPU82.TLB:TLB_shootdowns
3843 ± 3% -15.6% 3242 interrupts.CPU83.TLB:TLB_shootdowns
4169 ± 5% -40.3% 2489 ± 57% interrupts.CPU86.TLB:TLB_shootdowns
4062 ± 6% -17.6% 3345 ± 2% interrupts.CPU87.TLB:TLB_shootdowns



***************************************************************************************************
vm-snb-4G: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 4G
=========================================================================================
compiler/group/kconfig/rootfs/tbox_group/testcase:
gcc-7/kselftests-02/x86_64-rhel-7.2/debian-x86_64-2018-04-03.cgz/vm-snb-4G/kernel_selftests

commit:
9627026352 ("mm: page_cache_add_speculative(): refactoring")
cdaa813278 ("mm/gup: track gup-pinned pages")

96270263521248d5 cdaa813278ddc616ee201eacda7
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
4:12 -25% 1:12 kmsg.unregister_netdevice:waiting_for_veth_A-R1_to_become_free.Usage_count=
1:12 -8% :12 kmsg.veth0:Failed_to_cycle_device_veth0;route_tables_might_be_wrong
:12 100% 12:12 kernel_selftests.memfd.run_fuse_test.sh.fail
:12 8% 1:12 kernel_selftests.net.ip_defrag.sh.fail
%stddev %change %stddev
\ | \
1.00 -100.0% 0.00 kernel_selftests.memfd.run_fuse_test.sh.pass
361623 +223.2% 1168751 meminfo.Active
361623 +223.2% 1168751 meminfo.Active(anon)
127769 ± 21% -94.8% 6662 ± 26% meminfo.CmaFree
1941240 -41.3% 1139756 meminfo.MemAvailable
1895394 -42.4% 1091194 ± 2% meminfo.MemFree
2137756 +37.6% 2941952 meminfo.Memused
881.70 ± 5% +52.8% 1347 ± 4% meminfo.max_used_kB
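
The memfd failure flagged above (kernel_selftests.memfd.run_fuse_test.sh.fail)
can be chased outside the robot with the in-tree kselftests harness; a minimal
sketch, assuming a built kernel tree checked out on the test guest:

	# builds and runs the memfd selftests, including run_fuse_test.sh
	make -C tools/testing/selftests TARGETS=memfd run_tests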



***************************************************************************************************
lkp-bdw-ep3: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
pipe/gcc-7/performance/1HDD/x86_64-rhel-7.2/100%/debian-x86_64-2018-04-03.cgz/lkp-bdw-ep3/stress-ng/60s/0xb00002e

commit:
9627026352 ("mm: page_cache_add_speculative(): refactoring")
cdaa813278 ("mm/gup: track gup-pinned pages")

96270263521248d5 cdaa813278ddc616ee201eacda7
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:4 100% 4:4 dmesg.RIP:get_gup_pin_page
:4 100% 4:4 dmesg.WARNING:at_mm/gup.c:#get_gup_pin_page
%stddev %change %stddev
\ | \
360.22 -16.3% 301.59 stress-ng.time.elapsed_time
360.22 -16.3% 301.59 stress-ng.time.elapsed_time.max
8342 -1.3% 8234 stress-ng.time.percent_of_cpu_this_job_got
26890 -19.3% 21707 stress-ng.time.system_time
2.989e+08 -99.3% 2099213 stress-ng.vm-splice.ops
12010366 +10.0% 13211802 meminfo.Committed_AS
5812 ± 4% +18.0% 6856 ± 3% meminfo.max_used_kB
5.64 ± 2% +1.3 6.94 ± 2% mpstat.cpu.idle%
9.90 +1.8 11.68 ± 2% mpstat.cpu.usr%
107366 ± 93% -78.7% 22825 ± 44% numa-meminfo.node0.AnonPages
74249 ± 6% -13.8% 63969 ± 9% numa-meminfo.node0.SUnreclaim
5.90 ± 2% +22.8% 7.24 iostat.cpu.idle
84.23 -3.7% 81.11 iostat.cpu.system
9.87 +18.0% 11.64 ± 2% iostat.cpu.user
16463 +2.3% 16841 proc-vmstat.nr_kernel_stack
275921 ± 93% -50.8% 135706 ±150% proc-vmstat.numa_pte_updates
1000818 -14.3% 857522 proc-vmstat.pgfault
83.75 -3.6% 80.75 vmstat.cpu.sy
9.25 ± 4% +21.6% 11.25 ± 3% vmstat.cpu.us
1766858 +20.5% 2129192 vmstat.system.cs
185535 +1.5% 188310 vmstat.system.in
26840 ± 93% -78.8% 5697 ± 44% numa-vmstat.node0.nr_anon_pages
18563 ± 6% -13.8% 15992 ± 9% numa-vmstat.node0.nr_slab_unreclaimable
1.066e+09 ± 2% -28.2% 7.648e+08 ± 7% numa-vmstat.node0.numa_hit
1.066e+09 ± 2% -28.2% 7.648e+08 ± 7% numa-vmstat.node0.numa_local
9.301e+08 ± 2% -17.2% 7.701e+08 ± 6% numa-vmstat.node1.numa_hit
9.3e+08 ± 2% -17.2% 7.699e+08 ± 6% numa-vmstat.node1.numa_local
2.15 ± 2% -0.1 2.03 ± 4% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
0.87 -0.1 0.81 perf-profile.calltrace.cycles-pp.avc_has_perm.file_has_perm.security_file_permission.vfs_read.ksys_read
1.35 -0.0 1.30 perf-profile.calltrace.cycles-pp.file_has_perm.security_file_permission.vfs_read.ksys_read.do_syscall_64
0.83 -0.0 0.80 perf-profile.calltrace.cycles-pp.avc_has_perm.file_has_perm.security_file_permission.vfs_write.ksys_write
1.31 -0.0 1.27 perf-profile.calltrace.cycles-pp.file_has_perm.security_file_permission.vfs_write.ksys_write.do_syscall_64
0.55 +0.0 0.57 perf-profile.calltrace.cycles-pp.__inode_security_revalidate.selinux_file_permission.security_file_permission.vfs_write.ksys_write
2.09 +0.1 2.15 perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.pipe_write.__vfs_write.vfs_write.ksys_write
2.62 ± 2% +0.1 2.68 perf-profile.calltrace.cycles-pp.pipe_wait.pipe_write.__vfs_write.vfs_write.ksys_write
2650 -1.3% 2616 turbostat.Avg_MHz
3.85 ± 2% +0.7 4.55 turbostat.C1%
0.54 ± 7% +0.2 0.70 ± 12% turbostat.C1E%
4.71 ± 2% +21.7% 5.73 turbostat.CPU%c1
77.50 ± 2% -14.2% 66.50 ± 2% turbostat.CoreTmp
67302470 -14.9% 57247592 turbostat.IRQ
81.75 ± 2% -13.1% 71.00 turbostat.PkgTmp
261.28 +3.9% 271.39 turbostat.PkgWatt
1285315 ± 8% +15.7% 1487427 ± 6% sched_debug.cfs_rq:/.MIN_vruntime.stddev
173175 -16.1% 145379 sched_debug.cfs_rq:/.exec_clock.avg
174188 -16.1% 146168 sched_debug.cfs_rq:/.exec_clock.max
171345 -16.0% 143970 sched_debug.cfs_rq:/.exec_clock.min
5689 ± 7% -27.5% 4124 ± 10% sched_debug.cfs_rq:/.load.min
168.54 ± 6% +12.3% 189.25 ± 6% sched_debug.cfs_rq:/.load_avg.max
1285315 ± 8% +15.7% 1487427 ± 6% sched_debug.cfs_rq:/.max_vruntime.stddev
16247293 -15.8% 13678528 sched_debug.cfs_rq:/.min_vruntime.avg
17621646 ± 2% -13.2% 15289779 ± 3% sched_debug.cfs_rq:/.min_vruntime.max
14916399 -18.2% 12204755 ± 3% sched_debug.cfs_rq:/.min_vruntime.min
0.13 ± 6% +21.7% 0.16 ± 8% sched_debug.cfs_rq:/.nr_running.stddev
4.96 ± 4% -20.3% 3.96 ± 9% sched_debug.cfs_rq:/.runnable_load_avg.min
5426 ± 8% -19.0% 4397 ± 11% sched_debug.cfs_rq:/.runnable_weight.min
10113 ± 21% -48.9% 5166 ± 13% sched_debug.cpu.avg_idle.min
222767 -13.4% 192825 sched_debug.cpu.clock.avg
222785 -13.4% 192841 sched_debug.cpu.clock.max
222745 -13.4% 192807 sched_debug.cpu.clock.min
11.90 ± 12% -17.7% 9.80 ± 12% sched_debug.cpu.clock.stddev
222767 -13.4% 192825 sched_debug.cpu.clock_task.avg
222785 -13.4% 192841 sched_debug.cpu.clock_task.max
222745 -13.4% 192807 sched_debug.cpu.clock_task.min
11.90 ± 12% -17.7% 9.80 ± 12% sched_debug.cpu.clock_task.stddev
5.21 ± 3% -12.9% 4.54 ± 3% sched_debug.cpu.cpu_load[0].min
4955 -19.3% 4001 sched_debug.cpu.curr->pid.avg
6909 -11.6% 6108 sched_debug.cpu.curr->pid.max
11634 ± 7% -10.9% 10363 sched_debug.cpu.load.avg
193768 -15.5% 163653 sched_debug.cpu.nr_load_updates.avg
197917 -15.6% 167117 sched_debug.cpu.nr_load_updates.max
191959 -15.6% 162097 sched_debug.cpu.nr_load_updates.min
0.50 ± 6% +10.8% 0.56 ± 6% sched_debug.cpu.nr_running.stddev
4644436 ± 2% -7.8% 4280609 ± 2% sched_debug.cpu.nr_switches.avg
4644663 ± 2% -7.8% 4280771 ± 2% sched_debug.cpu.sched_count.avg
409748 ± 4% -22.5% 317516 ± 3% sched_debug.cpu.sched_goidle.avg
572470 ± 11% -22.4% 444043 ± 13% sched_debug.cpu.sched_goidle.max
286861 ± 7% -22.8% 221401 ± 8% sched_debug.cpu.sched_goidle.min
222744 -13.4% 192807 sched_debug.cpu_clk
219014 -13.7% 189076 sched_debug.ktime
223427 -13.4% 193488 sched_debug.sched_clk
14.20 +15.8% 16.44 perf-stat.i.MPKI
3.166e+10 +17.1% 3.708e+10 perf-stat.i.branch-instructions
1.57 +0.0 1.59 perf-stat.i.branch-miss-rate%
4.847e+08 +16.6% 5.653e+08 ± 2% perf-stat.i.branch-misses
12.07 ± 23% -7.1 4.97 ± 55% perf-stat.i.cache-miss-rate%
9.997e+08 +18.7% 1.187e+09 perf-stat.i.cache-references
1782714 +20.5% 2148963 perf-stat.i.context-switches
4.01 -60.1% 1.60 perf-stat.i.cpi
269394 ± 5% +18.6% 319628 ± 2% perf-stat.i.cpu-migrations
0.19 ± 3% +0.0 0.22 perf-stat.i.dTLB-load-miss-rate%
47856194 ± 3% +19.6% 57235440 perf-stat.i.dTLB-load-misses
3.932e+10 +16.8% 4.592e+10 perf-stat.i.dTLB-loads
2.459e+10 +16.7% 2.871e+10 perf-stat.i.dTLB-stores
72.03 +1.3 73.35 perf-stat.i.iTLB-load-miss-rate%
92906511 ± 2% +16.5% 1.083e+08 ± 3% perf-stat.i.iTLB-load-misses
35482612 +12.6% 39969391 ± 2% perf-stat.i.iTLB-loads
1.521e+11 +17.0% 1.779e+11 perf-stat.i.instructions
28571 ± 3% +14.4% 32688 ± 6% perf-stat.i.instructions-per-iTLB-miss
0.64 +16.9% 0.75 perf-stat.i.ipc
2696 +1.9% 2746 perf-stat.i.minor-faults
115872 ± 4% +33.3% 154453 ± 11% perf-stat.i.node-loads
69.92 ± 5% -6.0 63.95 ± 2% perf-stat.i.node-store-miss-rate%
2696 +1.9% 2746 perf-stat.i.page-faults
1.53 -15.3% 1.30 perf-stat.overall.cpi
0.65 +18.1% 0.77 perf-stat.overall.ipc
3.153e+10 +16.9% 3.684e+10 perf-stat.ps.branch-instructions
4.827e+08 +16.3% 5.616e+08 ± 2% perf-stat.ps.branch-misses
9.972e+08 +18.4% 1.181e+09 perf-stat.ps.cache-references
1778082 +20.3% 2138597 perf-stat.ps.context-switches
2.319e+11 -1.2% 2.292e+11 perf-stat.ps.cpu-cycles
268859 ± 5% +18.4% 318278 ± 2% perf-stat.ps.cpu-migrations
47762824 ± 3% +19.4% 57005870 perf-stat.ps.dTLB-load-misses
3.916e+10 +16.5% 4.563e+10 perf-stat.ps.dTLB-loads
2.449e+10 +16.5% 2.852e+10 perf-stat.ps.dTLB-stores
92569828 ± 2% +16.2% 1.076e+08 ± 3% perf-stat.ps.iTLB-load-misses
35350893 +12.4% 39724157 ± 2% perf-stat.ps.iTLB-loads
1.515e+11 +16.7% 1.767e+11 perf-stat.ps.instructions
2690 +1.6% 2733 perf-stat.ps.minor-faults
115582 ± 4% +33.0% 153669 ± 11% perf-stat.ps.node-loads
2690 +1.6% 2733 perf-stat.ps.page-faults
5.463e+13 -2.2% 5.344e+13 perf-stat.total.instructions
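
As a quick sanity check, the overall CPI/IPC figures above can be
re-derived from the reported cycle and instruction rates. A minimal
sketch in Python, assuming perf-stat.overall.cpi is simply cpu-cycles
divided by instructions (values copied from the perf-stat.ps rows):

# Re-derive CPI/IPC from the perf-stat.ps counters quoted above.
base_cycles, base_insns = 2.319e11, 1.515e11   # base kernel
new_cycles,  new_insns  = 2.292e11, 1.767e11   # patched kernel

for tag, cycles, insns in (("base", base_cycles, base_insns),
                           ("patched", new_cycles, new_insns)):
    cpi = cycles / insns
    print(f"{tag}: cpi={cpi:.2f}  ipc={1.0 / cpi:.2f}")

# Prints cpi=1.53/ipc=0.65 (base) and cpi=1.30/ipc=0.77 (patched),
# matching the perf-stat.overall.cpi and perf-stat.overall.ipc rows.
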
423.00 -16.0% 355.50 interrupts.9:IO-APIC.9-fasteoi.acpi
334285 -13.8% 288189 interrupts.CAL:Function_call_interrupts
3718 ± 5% -11.2% 3301 interrupts.CPU0.CAL:Function_call_interrupts
721763 -16.2% 605183 interrupts.CPU0.LOC:Local_timer_interrupts
25452 ± 12% +29.7% 33009 ± 11% interrupts.CPU0.RES:Rescheduling_interrupts
423.00 -16.0% 355.50 interrupts.CPU1.9:IO-APIC.9-fasteoi.acpi
3841 -13.6% 3317 interrupts.CPU1.CAL:Function_call_interrupts
721450 -16.2% 604748 interrupts.CPU1.LOC:Local_timer_interrupts
3776 ± 2% -12.2% 3317 interrupts.CPU10.CAL:Function_call_interrupts
722124 -16.2% 604931 interrupts.CPU10.LOC:Local_timer_interrupts
22581 ± 13% +44.5% 32630 ± 25% interrupts.CPU10.RES:Rescheduling_interrupts
3835 -13.4% 3321 interrupts.CPU11.CAL:Function_call_interrupts
721997 -16.2% 605063 interrupts.CPU11.LOC:Local_timer_interrupts
21969 ± 12% +39.6% 30662 ± 10% interrupts.CPU11.RES:Rescheduling_interrupts
3828 -12.7% 3340 interrupts.CPU12.CAL:Function_call_interrupts
721760 -16.2% 604828 interrupts.CPU12.LOC:Local_timer_interrupts
3790 -12.8% 3303 interrupts.CPU13.CAL:Function_call_interrupts
721659 -16.2% 604978 interrupts.CPU13.LOC:Local_timer_interrupts
3796 -12.6% 3317 interrupts.CPU14.CAL:Function_call_interrupts
721830 -16.2% 605051 interrupts.CPU14.LOC:Local_timer_interrupts
3745 ± 2% -11.6% 3313 interrupts.CPU15.CAL:Function_call_interrupts
721696 -16.1% 605247 interrupts.CPU15.LOC:Local_timer_interrupts
3796 ± 2% -13.1% 3298 interrupts.CPU16.CAL:Function_call_interrupts
721615 -16.1% 605304 interrupts.CPU16.LOC:Local_timer_interrupts
721686 -16.2% 604977 interrupts.CPU17.LOC:Local_timer_interrupts
23557 ± 11% +31.8% 31037 ± 17% interrupts.CPU17.RES:Rescheduling_interrupts
3841 -13.9% 3308 interrupts.CPU18.CAL:Function_call_interrupts
721620 -16.2% 604958 interrupts.CPU18.LOC:Local_timer_interrupts
22764 ± 17% +33.8% 30451 ± 8% interrupts.CPU18.RES:Rescheduling_interrupts
3790 ± 2% -12.8% 3304 interrupts.CPU19.CAL:Function_call_interrupts
721479 -16.1% 604999 interrupts.CPU19.LOC:Local_timer_interrupts
3845 -14.0% 3307 interrupts.CPU2.CAL:Function_call_interrupts
721284 -16.1% 605094 interrupts.CPU2.LOC:Local_timer_interrupts
22880 ± 19% +42.0% 32484 ± 12% interrupts.CPU2.RES:Rescheduling_interrupts
721360 -16.1% 604946 interrupts.CPU20.LOC:Local_timer_interrupts
21116 ± 18% +39.1% 29366 ± 17% interrupts.CPU20.RES:Rescheduling_interrupts
3811 -13.5% 3297 interrupts.CPU21.CAL:Function_call_interrupts
721473 -16.2% 604796 interrupts.CPU21.LOC:Local_timer_interrupts
3827 -15.3% 3241 ± 3% interrupts.CPU22.CAL:Function_call_interrupts
721210 -16.2% 604237 interrupts.CPU22.LOC:Local_timer_interrupts
721987 -16.2% 604725 interrupts.CPU23.LOC:Local_timer_interrupts
3821 -19.4% 3081 ± 6% interrupts.CPU24.CAL:Function_call_interrupts
721544 -16.2% 604685 interrupts.CPU24.LOC:Local_timer_interrupts
3786 -15.6% 3194 ± 3% interrupts.CPU25.CAL:Function_call_interrupts
721135 -16.1% 604853 interrupts.CPU25.LOC:Local_timer_interrupts
3827 -15.2% 3245 ± 2% interrupts.CPU26.CAL:Function_call_interrupts
721813 -16.2% 605191 interrupts.CPU26.LOC:Local_timer_interrupts
3821 -25.4% 2851 ± 26% interrupts.CPU27.CAL:Function_call_interrupts
721056 -16.1% 605078 interrupts.CPU27.LOC:Local_timer_interrupts
3818 -15.1% 3241 ± 3% interrupts.CPU28.CAL:Function_call_interrupts
721295 -16.1% 604867 interrupts.CPU28.LOC:Local_timer_interrupts
3839 -14.2% 3293 interrupts.CPU29.CAL:Function_call_interrupts
721791 -16.2% 604635 interrupts.CPU29.LOC:Local_timer_interrupts
3818 -13.1% 3318 interrupts.CPU3.CAL:Function_call_interrupts
721555 -16.2% 604385 interrupts.CPU3.LOC:Local_timer_interrupts
22807 ± 12% +32.9% 30316 ± 13% interrupts.CPU3.RES:Rescheduling_interrupts
3829 -14.3% 3280 interrupts.CPU30.CAL:Function_call_interrupts
721877 -16.2% 604644 interrupts.CPU30.LOC:Local_timer_interrupts
29149 ± 5% -15.8% 24552 ± 10% interrupts.CPU30.RES:Rescheduling_interrupts
3838 -13.9% 3304 interrupts.CPU31.CAL:Function_call_interrupts
721927 -16.2% 604849 interrupts.CPU31.LOC:Local_timer_interrupts
3836 -14.2% 3290 interrupts.CPU32.CAL:Function_call_interrupts
720992 -16.1% 604786 interrupts.CPU32.LOC:Local_timer_interrupts
3826 -14.0% 3292 interrupts.CPU33.CAL:Function_call_interrupts
721853 -16.2% 604700 interrupts.CPU33.LOC:Local_timer_interrupts
3809 -13.8% 3283 interrupts.CPU34.CAL:Function_call_interrupts
721742 -16.2% 605038 interrupts.CPU34.LOC:Local_timer_interrupts
3833 -14.0% 3296 interrupts.CPU35.CAL:Function_call_interrupts
721732 -16.2% 604841 interrupts.CPU35.LOC:Local_timer_interrupts
3700 ± 6% -12.7% 3229 ± 3% interrupts.CPU36.CAL:Function_call_interrupts
721598 -16.2% 604849 interrupts.CPU36.LOC:Local_timer_interrupts
3835 -14.1% 3293 interrupts.CPU37.CAL:Function_call_interrupts
721245 -16.2% 604498 interrupts.CPU37.LOC:Local_timer_interrupts
3831 -13.8% 3301 interrupts.CPU38.CAL:Function_call_interrupts
721366 -16.1% 605049 interrupts.CPU38.LOC:Local_timer_interrupts
3843 -15.3% 3253 ± 2% interrupts.CPU39.CAL:Function_call_interrupts
721951 -16.2% 604693 interrupts.CPU39.LOC:Local_timer_interrupts
3823 -13.2% 3316 interrupts.CPU4.CAL:Function_call_interrupts
721523 -16.2% 604953 interrupts.CPU4.LOC:Local_timer_interrupts
22707 ± 16% +33.2% 30241 ± 3% interrupts.CPU4.RES:Rescheduling_interrupts
3840 -14.2% 3294 interrupts.CPU40.CAL:Function_call_interrupts
721636 -16.2% 604810 interrupts.CPU40.LOC:Local_timer_interrupts
3847 -18.8% 3122 ± 9% interrupts.CPU41.CAL:Function_call_interrupts
721928 -16.2% 604884 interrupts.CPU41.LOC:Local_timer_interrupts
3839 -14.3% 3290 interrupts.CPU42.CAL:Function_call_interrupts
721753 -16.2% 604811 interrupts.CPU42.LOC:Local_timer_interrupts
3778 -13.8% 3255 interrupts.CPU43.CAL:Function_call_interrupts
721315 -16.2% 604691 interrupts.CPU43.LOC:Local_timer_interrupts
3841 -13.7% 3314 interrupts.CPU44.CAL:Function_call_interrupts
721306 -16.2% 604640 interrupts.CPU44.LOC:Local_timer_interrupts
22357 ± 16% +47.9% 33078 ± 15% interrupts.CPU44.RES:Rescheduling_interrupts
3832 -15.0% 3256 ± 3% interrupts.CPU45.CAL:Function_call_interrupts
721470 -16.2% 604760 interrupts.CPU45.LOC:Local_timer_interrupts
3846 -18.1% 3149 ± 9% interrupts.CPU46.CAL:Function_call_interrupts
721294 -16.2% 604606 interrupts.CPU46.LOC:Local_timer_interrupts
22430 ± 19% +39.5% 31283 ± 10% interrupts.CPU46.RES:Rescheduling_interrupts
3795 -12.5% 3320 interrupts.CPU47.CAL:Function_call_interrupts
721274 -16.2% 604302 interrupts.CPU47.LOC:Local_timer_interrupts
22255 ± 13% +29.9% 28907 ± 5% interrupts.CPU47.RES:Rescheduling_interrupts
3841 -14.2% 3295 interrupts.CPU48.CAL:Function_call_interrupts
721357 -16.2% 604782 interrupts.CPU48.LOC:Local_timer_interrupts
20959 ± 17% +36.5% 28600 ± 7% interrupts.CPU48.RES:Rescheduling_interrupts
3835 -13.3% 3324 interrupts.CPU49.CAL:Function_call_interrupts
721311 -16.1% 605127 interrupts.CPU49.LOC:Local_timer_interrupts
22089 ± 15% +33.1% 29406 ± 11% interrupts.CPU49.RES:Rescheduling_interrupts
3842 -14.0% 3305 interrupts.CPU5.CAL:Function_call_interrupts
721561 -16.1% 605092 interrupts.CPU5.LOC:Local_timer_interrupts
24272 ± 14% +25.3% 30413 ± 11% interrupts.CPU5.RES:Rescheduling_interrupts
3848 -13.7% 3319 interrupts.CPU50.CAL:Function_call_interrupts
721251 -16.1% 604921 interrupts.CPU50.LOC:Local_timer_interrupts
3827 -13.3% 3318 interrupts.CPU51.CAL:Function_call_interrupts
721474 -16.2% 604842 interrupts.CPU51.LOC:Local_timer_interrupts
3812 -12.5% 3337 interrupts.CPU52.CAL:Function_call_interrupts
721284 -16.1% 604942 interrupts.CPU52.LOC:Local_timer_interrupts
22236 ± 10% +33.3% 29642 ± 8% interrupts.CPU52.RES:Rescheduling_interrupts
721452 -16.1% 605377 interrupts.CPU53.LOC:Local_timer_interrupts
21926 ± 15% +38.2% 30305 ± 12% interrupts.CPU53.RES:Rescheduling_interrupts
721468 -16.1% 605249 interrupts.CPU54.LOC:Local_timer_interrupts
22522 ± 13% +36.4% 30717 ± 17% interrupts.CPU54.RES:Rescheduling_interrupts
3842 -13.9% 3307 interrupts.CPU55.CAL:Function_call_interrupts
721620 -16.2% 604986 interrupts.CPU55.LOC:Local_timer_interrupts
22201 ± 16% +31.7% 29239 ± 11% interrupts.CPU55.RES:Rescheduling_interrupts
3811 -13.3% 3303 interrupts.CPU56.CAL:Function_call_interrupts
721916 -16.2% 605011 interrupts.CPU56.LOC:Local_timer_interrupts
721636 -16.2% 604961 interrupts.CPU57.LOC:Local_timer_interrupts
3841 -13.9% 3309 interrupts.CPU58.CAL:Function_call_interrupts
721637 -16.1% 605154 interrupts.CPU58.LOC:Local_timer_interrupts
3795 -12.7% 3314 interrupts.CPU59.CAL:Function_call_interrupts
721926 -16.1% 605366 interrupts.CPU59.LOC:Local_timer_interrupts
3841 -17.0% 3189 ± 7% interrupts.CPU6.CAL:Function_call_interrupts
721537 -16.1% 605025 interrupts.CPU6.LOC:Local_timer_interrupts
21627 ± 16% +36.9% 29599 ± 15% interrupts.CPU6.RES:Rescheduling_interrupts
3839 -13.5% 3319 interrupts.CPU60.CAL:Function_call_interrupts
721689 -16.2% 604938 interrupts.CPU60.LOC:Local_timer_interrupts
22724 ± 12% +31.4% 29852 ± 13% interrupts.CPU60.RES:Rescheduling_interrupts
721538 -16.1% 605018 interrupts.CPU61.LOC:Local_timer_interrupts
22521 ± 15% +33.2% 29995 ± 19% interrupts.CPU61.RES:Rescheduling_interrupts
3795 ± 2% -12.7% 3313 interrupts.CPU62.CAL:Function_call_interrupts
721565 -16.2% 604946 interrupts.CPU62.LOC:Local_timer_interrupts
22054 ± 17% +38.7% 30598 ± 11% interrupts.CPU62.RES:Rescheduling_interrupts
721884 -16.2% 605030 interrupts.CPU63.LOC:Local_timer_interrupts
3772 ± 3% -12.3% 3309 interrupts.CPU64.CAL:Function_call_interrupts
721981 -16.2% 605040 interrupts.CPU64.LOC:Local_timer_interrupts
21437 ± 20% +33.6% 28639 ± 19% interrupts.CPU64.RES:Rescheduling_interrupts
3779 -19.3% 3051 ± 14% interrupts.CPU65.CAL:Function_call_interrupts
721390 -16.1% 605132 interrupts.CPU65.LOC:Local_timer_interrupts
3825 -13.5% 3309 interrupts.CPU66.CAL:Function_call_interrupts
722064 -16.3% 604606 interrupts.CPU66.LOC:Local_timer_interrupts
3819 -26.5% 2808 ± 24% interrupts.CPU67.CAL:Function_call_interrupts
721439 -16.2% 604744 interrupts.CPU67.LOC:Local_timer_interrupts
3821 -13.5% 3306 interrupts.CPU68.CAL:Function_call_interrupts
722060 -16.3% 604673 interrupts.CPU68.LOC:Local_timer_interrupts
3821 -13.7% 3297 interrupts.CPU69.CAL:Function_call_interrupts
722106 -16.3% 604542 interrupts.CPU69.LOC:Local_timer_interrupts
3842 -14.2% 3297 interrupts.CPU7.CAL:Function_call_interrupts
721363 -16.1% 605295 interrupts.CPU7.LOC:Local_timer_interrupts
22221 ± 16% +40.9% 31317 ± 19% interrupts.CPU7.RES:Rescheduling_interrupts
3826 -13.6% 3305 interrupts.CPU70.CAL:Function_call_interrupts
721764 -16.2% 604627 interrupts.CPU70.LOC:Local_timer_interrupts
3809 -13.5% 3295 interrupts.CPU71.CAL:Function_call_interrupts
722080 -16.2% 604781 interrupts.CPU71.LOC:Local_timer_interrupts
3818 -14.2% 3274 ± 2% interrupts.CPU72.CAL:Function_call_interrupts
721916 -16.2% 604918 interrupts.CPU72.LOC:Local_timer_interrupts
27835 ± 14% -17.6% 22945 ± 10% interrupts.CPU72.RES:Rescheduling_interrupts
3813 -13.2% 3308 interrupts.CPU73.CAL:Function_call_interrupts
721747 -16.2% 604707 interrupts.CPU73.LOC:Local_timer_interrupts
721746 -16.2% 604838 interrupts.CPU74.LOC:Local_timer_interrupts
3832 -14.7% 3269 ± 2% interrupts.CPU75.CAL:Function_call_interrupts
721781 -16.2% 604619 interrupts.CPU75.LOC:Local_timer_interrupts
3779 -12.4% 3310 interrupts.CPU76.CAL:Function_call_interrupts
720970 -16.1% 604618 interrupts.CPU76.LOC:Local_timer_interrupts
3809 ± 2% -15.7% 3211 ± 4% interrupts.CPU77.CAL:Function_call_interrupts
722062 -16.3% 604188 interrupts.CPU77.LOC:Local_timer_interrupts
3843 -14.6% 3284 ± 2% interrupts.CPU78.CAL:Function_call_interrupts
721715 -16.2% 604636 interrupts.CPU78.LOC:Local_timer_interrupts
3840 -13.9% 3306 interrupts.CPU79.CAL:Function_call_interrupts
721613 -16.2% 604526 interrupts.CPU79.LOC:Local_timer_interrupts
3763 ± 4% -11.5% 3331 interrupts.CPU8.CAL:Function_call_interrupts
721367 -16.1% 605107 interrupts.CPU8.LOC:Local_timer_interrupts
22471 ± 17% +38.2% 31048 ± 9% interrupts.CPU8.RES:Rescheduling_interrupts
3830 -13.9% 3298 interrupts.CPU80.CAL:Function_call_interrupts
721678 -16.2% 604714 interrupts.CPU80.LOC:Local_timer_interrupts
3776 ± 3% -12.5% 3306 interrupts.CPU81.CAL:Function_call_interrupts
721263 -16.1% 604803 interrupts.CPU81.LOC:Local_timer_interrupts
3837 -13.9% 3305 interrupts.CPU82.CAL:Function_call_interrupts
721961 -16.2% 604781 interrupts.CPU82.LOC:Local_timer_interrupts
3849 -14.8% 3280 interrupts.CPU83.CAL:Function_call_interrupts
721888 -16.2% 604619 interrupts.CPU83.LOC:Local_timer_interrupts
3802 -13.2% 3300 interrupts.CPU84.CAL:Function_call_interrupts
721937 -16.2% 604653 interrupts.CPU84.LOC:Local_timer_interrupts
3843 -13.8% 3313 interrupts.CPU85.CAL:Function_call_interrupts
721834 -16.2% 604711 interrupts.CPU85.LOC:Local_timer_interrupts
3839 -14.3% 3291 interrupts.CPU86.CAL:Function_call_interrupts
721871 -16.2% 604809 interrupts.CPU86.LOC:Local_timer_interrupts
3847 -14.6% 3287 interrupts.CPU87.CAL:Function_call_interrupts
722149 -16.3% 604789 interrupts.CPU87.LOC:Local_timer_interrupts
3838 -14.1% 3298 interrupts.CPU9.CAL:Function_call_interrupts
721253 -16.2% 604595 interrupts.CPU9.LOC:Local_timer_interrupts
22535 ± 19% +36.1% 30680 ± 11% interrupts.CPU9.RES:Rescheduling_interrupts
63503101 -16.2% 53227032 interrupts.LOC:Local_timer_interrupts
310.00 ± 3% +11.0% 344.00 ± 7% interrupts.TLB:TLB_shootdowns
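
The local timer counts are internally consistent: the rows above cover
88 CPUs (CPU0..CPU87), and multiplying the roughly uniform per-CPU
LOC:Local_timer_interrupts values by 88 reproduces the interrupts.LOC
totals. A minimal sketch, assuming interrupts.LOC is just the per-CPU sum:

cpus = 88                                   # CPU0..CPU87 in the rows above
base_per_cpu, new_per_cpu = 721626, 604853  # approximate per-CPU LOC counts

print(cpus * base_per_cpu)  # 63503088, vs. interrupts.LOC base of 63503101
print(cpus * new_per_cpu)   # 53227064, vs. interrupts.LOC new of 53227032
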
20431 ± 15% -19.7% 16415 ± 4% softirqs.CPU0.RCU
34582 ± 2% +17.5% 40639 ± 8% softirqs.CPU0.SCHED
133704 -12.9% 116517 ± 3% softirqs.CPU0.TIMER
20017 ± 16% -18.6% 16300 softirqs.CPU1.RCU
32010 ± 3% +17.3% 37561 ± 8% softirqs.CPU1.SCHED
132385 ± 2% -14.8% 112812 softirqs.CPU1.TIMER
19743 ± 16% -20.2% 15764 ± 3% softirqs.CPU10.RCU
30881 ± 3% +20.6% 37253 ± 10% softirqs.CPU10.SCHED
130801 ± 2% -8.5% 119711 ± 6% softirqs.CPU10.TIMER
30607 ± 2% +20.4% 36850 ± 8% softirqs.CPU11.SCHED
133304 ± 2% -11.9% 117463 ± 2% softirqs.CPU11.TIMER
30848 +20.1% 37052 ± 9% softirqs.CPU12.SCHED
131208 -11.5% 116174 ± 3% softirqs.CPU12.TIMER
22042 ± 17% -28.7% 15719 ± 5% softirqs.CPU13.RCU
30520 ± 5% +21.7% 37133 ± 11% softirqs.CPU13.SCHED
135598 ± 4% -15.4% 114708 ± 2% softirqs.CPU13.TIMER
31132 ± 4% +15.4% 35911 ± 10% softirqs.CPU14.SCHED
148384 ± 17% -22.8% 114512 softirqs.CPU14.TIMER
30303 +20.6% 36560 ± 10% softirqs.CPU15.SCHED
133852 -14.1% 115008 softirqs.CPU15.TIMER
31199 ± 2% +18.5% 36961 ± 8% softirqs.CPU16.SCHED
133767 ± 2% -13.1% 116191 softirqs.CPU16.TIMER
19872 ± 17% -17.2% 16453 ± 2% softirqs.CPU17.RCU
30661 ± 2% +19.2% 36554 ± 9% softirqs.CPU17.SCHED
133274 ± 2% -15.1% 113122 ± 2% softirqs.CPU17.TIMER
20088 ± 15% -17.5% 16582 ± 2% softirqs.CPU18.RCU
31180 ± 3% +15.6% 36055 ± 9% softirqs.CPU18.SCHED
134796 ± 4% -15.7% 113667 softirqs.CPU18.TIMER
22481 ± 20% -26.8% 16451 ± 2% softirqs.CPU19.RCU
30809 ± 2% +19.5% 36811 ± 7% softirqs.CPU19.SCHED
132271 ± 2% -12.4% 115843 ± 3% softirqs.CPU19.TIMER
19797 ± 16% -17.5% 16326 ± 2% softirqs.CPU2.RCU
30922 ± 3% +21.6% 37612 ± 8% softirqs.CPU2.SCHED
25248 ± 29% -24.3% 19119 ± 24% softirqs.CPU20.RCU
29985 ± 2% +22.1% 36608 ± 8% softirqs.CPU20.SCHED
128772 ± 2% -10.9% 114692 softirqs.CPU20.TIMER
19935 ± 17% -19.2% 16110 ± 3% softirqs.CPU21.RCU
31495 ± 3% +16.4% 36651 ± 7% softirqs.CPU21.SCHED
130837 ± 3% -13.2% 113573 ± 2% softirqs.CPU21.TIMER
40257 ± 5% -13.7% 34739 ± 7% softirqs.CPU22.SCHED
133177 ± 2% -16.6% 111009 ± 3% softirqs.CPU22.TIMER
40524 ± 5% -17.6% 33383 ± 9% softirqs.CPU23.SCHED
133547 ± 4% -18.1% 109369 softirqs.CPU23.TIMER
39849 ± 3% -15.3% 33755 ± 9% softirqs.CPU24.SCHED
131526 ± 2% -17.7% 108249 ± 2% softirqs.CPU24.TIMER
39996 ± 4% -13.6% 34576 ± 6% softirqs.CPU25.SCHED
131627 ± 3% -16.1% 110442 ± 3% softirqs.CPU25.TIMER
39898 ± 5% -14.9% 33947 ± 7% softirqs.CPU26.SCHED
132681 -16.5% 110841 ± 4% softirqs.CPU26.TIMER
40079 ± 7% -15.6% 33826 ± 6% softirqs.CPU27.SCHED
133577 -18.1% 109462 ± 2% softirqs.CPU27.TIMER
40228 ± 5% -15.4% 34036 ± 6% softirqs.CPU28.SCHED
131075 ± 3% -17.9% 107666 ± 2% softirqs.CPU28.TIMER
39312 ± 3% -14.8% 33499 ± 8% softirqs.CPU29.SCHED
134423 ± 3% -18.4% 109702 ± 3% softirqs.CPU29.TIMER
19384 ± 17% -17.8% 15929 softirqs.CPU3.RCU
30436 ± 3% +21.2% 36897 ± 11% softirqs.CPU3.SCHED
132319 ± 3% -13.4% 114525 ± 2% softirqs.CPU3.TIMER
22908 ± 16% -22.4% 17781 ± 3% softirqs.CPU30.RCU
39762 ± 3% -15.0% 33803 ± 8% softirqs.CPU30.SCHED
134269 ± 2% -17.7% 110531 ± 2% softirqs.CPU30.TIMER
41024 ± 5% -17.4% 33894 ± 7% softirqs.CPU31.SCHED
149024 ± 17% -24.6% 112375 ± 3% softirqs.CPU31.TIMER
39978 ± 4% -15.5% 33774 ± 7% softirqs.CPU32.SCHED
131148 -17.2% 108564 ± 3% softirqs.CPU32.TIMER
39979 ± 4% -13.8% 34476 ± 9% softirqs.CPU33.SCHED
134094 ± 4% -18.1% 109836 ± 2% softirqs.CPU33.TIMER
22786 ± 16% -22.9% 17569 ± 4% softirqs.CPU34.RCU
40552 ± 4% -17.3% 33554 ± 7% softirqs.CPU34.SCHED
131088 ± 2% -17.0% 108798 ± 3% softirqs.CPU34.TIMER
39532 ± 3% -14.1% 33963 ± 7% softirqs.CPU35.SCHED
132211 ± 2% -15.9% 111137 ± 3% softirqs.CPU35.TIMER
38671 ± 4% -13.2% 33557 ± 9% softirqs.CPU36.SCHED
147868 ± 17% -27.0% 107982 softirqs.CPU36.TIMER
23249 ± 15% -22.1% 18100 ± 5% softirqs.CPU37.RCU
40190 ± 5% -15.3% 34023 ± 7% softirqs.CPU37.SCHED
132884 ± 3% -15.9% 111703 ± 2% softirqs.CPU37.TIMER
22745 ± 16% -23.2% 17469 ± 4% softirqs.CPU38.RCU
40569 ± 5% -16.3% 33950 ± 9% softirqs.CPU38.SCHED
135133 -18.3% 110441 ± 2% softirqs.CPU38.TIMER
39935 ± 5% -15.3% 33825 ± 8% softirqs.CPU39.SCHED
133753 ± 2% -17.5% 110412 softirqs.CPU39.TIMER
30946 ± 4% +17.7% 36409 ± 9% softirqs.CPU4.SCHED
130762 ± 2% -13.6% 113024 ± 2% softirqs.CPU4.TIMER
40335 ± 3% -16.4% 33703 ± 9% softirqs.CPU40.SCHED
130983 ± 2% -15.6% 110597 ± 2% softirqs.CPU40.TIMER
40423 ± 5% -17.1% 33525 ± 8% softirqs.CPU41.SCHED
132224 ± 4% -18.0% 108363 softirqs.CPU41.TIMER
40723 ± 4% -17.3% 33682 ± 7% softirqs.CPU42.SCHED
135049 -17.8% 110952 ± 3% softirqs.CPU42.TIMER
39349 ± 3% -14.8% 33536 ± 8% softirqs.CPU43.SCHED
134203 ± 2% -17.6% 110610 ± 2% softirqs.CPU43.TIMER
17084 ± 18% -18.0% 14000 ± 3% softirqs.CPU44.RCU
30946 +22.1% 37787 ± 9% softirqs.CPU44.SCHED
130568 -13.3% 113197 ± 3% softirqs.CPU44.TIMER
20252 ± 15% -19.8% 16238 ± 2% softirqs.CPU45.RCU
31638 ± 4% +16.3% 36780 ± 9% softirqs.CPU45.SCHED
132331 ± 2% -14.9% 112550 softirqs.CPU45.TIMER
24973 ± 13% -16.0% 20983 ± 2% softirqs.CPU46.RCU
30862 +22.1% 37672 ± 6% softirqs.CPU46.SCHED
137074 ± 3% -11.2% 121691 ± 4% softirqs.CPU46.TIMER
19750 ± 17% -18.6% 16079 ± 2% softirqs.CPU47.RCU
30289 ± 3% +21.4% 36762 ± 9% softirqs.CPU47.SCHED
130548 ± 3% -12.7% 113914 softirqs.CPU47.TIMER
19693 ± 16% -17.0% 16336 ± 4% softirqs.CPU48.RCU
30593 ± 5% +19.9% 36684 ± 9% softirqs.CPU48.SCHED
130499 ± 3% -13.6% 112753 ± 2% softirqs.CPU48.TIMER
30877 ± 2% +19.9% 37018 ± 9% softirqs.CPU49.SCHED
130875 ± 4% -13.0% 113833 softirqs.CPU49.TIMER
22365 ± 36% -27.9% 16131 softirqs.CPU5.RCU
30822 +20.2% 37048 ± 9% softirqs.CPU5.SCHED
130397 ± 3% -11.8% 115024 softirqs.CPU5.TIMER
22536 ± 37% -26.9% 16472 ± 4% softirqs.CPU50.RCU
30543 +20.3% 36752 ± 9% softirqs.CPU50.SCHED
133918 ± 3% -13.5% 115860 ± 4% softirqs.CPU50.TIMER
131702 ± 2% -9.6% 118995 ± 6% softirqs.CPU51.TIMER
20176 ± 17% -21.1% 15915 ± 3% softirqs.CPU52.RCU
30631 ± 3% +22.2% 37419 ± 7% softirqs.CPU52.SCHED
131629 -13.5% 113828 softirqs.CPU52.TIMER
31078 +21.4% 37736 ± 9% softirqs.CPU53.SCHED
133106 ± 3% -14.2% 114257 ± 3% softirqs.CPU53.TIMER
30982 ± 4% +19.3% 36962 ± 9% softirqs.CPU54.SCHED
25187 ± 31% -24.8% 18932 ± 27% softirqs.CPU55.RCU
30622 ± 3% +19.6% 36613 ± 9% softirqs.CPU55.SCHED
19796 ± 14% -18.3% 16167 ± 3% softirqs.CPU56.RCU
30828 +18.8% 36619 ± 9% softirqs.CPU56.SCHED
130194 ± 2% -12.1% 114503 ± 2% softirqs.CPU56.TIMER
30577 ± 5% +21.1% 37016 ± 10% softirqs.CPU57.SCHED
147818 ± 17% -24.6% 111515 ± 2% softirqs.CPU57.TIMER
25343 ± 33% -36.6% 16078 ± 4% softirqs.CPU58.RCU
31181 ± 2% +16.6% 36349 ± 10% softirqs.CPU58.SCHED
139070 ± 7% -16.4% 116283 ± 2% softirqs.CPU58.TIMER
19741 ± 16% -15.3% 16722 ± 5% softirqs.CPU59.RCU
30636 ± 2% +20.1% 36788 ± 9% softirqs.CPU59.SCHED
133696 -14.0% 114933 softirqs.CPU59.TIMER
30639 ± 2% +19.3% 36549 ± 9% softirqs.CPU6.SCHED
31080 +19.0% 36975 ± 7% softirqs.CPU60.SCHED
133580 ± 3% -13.5% 115606 softirqs.CPU60.TIMER
19131 ± 16% -16.9% 15898 ± 2% softirqs.CPU61.RCU
30495 ± 3% +20.2% 36664 ± 10% softirqs.CPU61.SCHED
133103 -15.3% 112675 softirqs.CPU61.TIMER
31019 ± 2% +18.8% 36849 ± 8% softirqs.CPU62.SCHED
134432 ± 4% -15.8% 113144 softirqs.CPU62.TIMER
30882 ± 2% +16.9% 36105 ± 8% softirqs.CPU63.SCHED
131772 ± 2% -12.7% 115005 ± 3% softirqs.CPU63.TIMER
19316 ± 16% -18.8% 15694 ± 4% softirqs.CPU64.RCU
30286 ± 2% +20.5% 36497 ± 8% softirqs.CPU64.SCHED
128582 ± 2% -10.1% 115656 ± 4% softirqs.CPU64.TIMER
19223 ± 17% -18.1% 15749 ± 3% softirqs.CPU65.RCU
30985 ± 2% +18.6% 36751 ± 8% softirqs.CPU65.SCHED
130355 ± 3% -13.0% 113407 ± 2% softirqs.CPU65.TIMER
40027 ± 4% -14.7% 34153 ± 7% softirqs.CPU66.SCHED
131990 ± 3% -16.9% 109630 ± 3% softirqs.CPU66.TIMER
40103 ± 5% -16.4% 33528 ± 7% softirqs.CPU67.SCHED
133065 ± 4% -18.5% 108445 softirqs.CPU67.TIMER
39282 -14.5% 33576 ± 9% softirqs.CPU68.SCHED
131993 ± 2% -18.9% 107006 ± 2% softirqs.CPU68.TIMER
131342 ± 3% -16.3% 109974 ± 3% softirqs.CPU69.TIMER
19684 ± 18% -16.4% 16446 ± 4% softirqs.CPU7.RCU
30804 ± 2% +19.7% 36877 ± 11% softirqs.CPU7.SCHED
39499 ± 4% -14.1% 33939 ± 7% softirqs.CPU70.SCHED
132374 -17.3% 109508 ± 4% softirqs.CPU70.TIMER
39816 ± 4% -13.6% 34387 ± 6% softirqs.CPU71.SCHED
133540 -18.0% 109519 ± 2% softirqs.CPU71.TIMER
40073 ± 4% -15.8% 33726 ± 6% softirqs.CPU72.SCHED
132172 ± 4% -18.5% 107744 softirqs.CPU72.TIMER
39089 ± 3% -14.1% 33596 ± 7% softirqs.CPU73.SCHED
131584 -16.4% 110040 ± 3% softirqs.CPU73.TIMER
40315 ± 4% -16.5% 33655 ± 7% softirqs.CPU74.SCHED
133455 -17.6% 109941 softirqs.CPU74.TIMER
40427 ± 5% -16.7% 33659 ± 8% softirqs.CPU75.SCHED
137065 ± 5% -18.6% 111601 ± 3% softirqs.CPU75.TIMER
39577 ± 3% -14.0% 34031 ± 7% softirqs.CPU76.SCHED
130476 -16.6% 108755 ± 4% softirqs.CPU76.TIMER
40386 ± 5% -14.9% 34356 ± 7% softirqs.CPU77.SCHED
134414 ± 4% -18.8% 109199 ± 3% softirqs.CPU77.TIMER
40179 ± 5% -16.2% 33656 ± 7% softirqs.CPU78.SCHED
131087 ± 2% -17.4% 108250 ± 3% softirqs.CPU78.TIMER
39263 ± 3% -14.2% 33686 ± 7% softirqs.CPU79.SCHED
130805 ± 2% -16.3% 109502 ± 2% softirqs.CPU79.TIMER
19896 ± 16% -19.7% 15987 ± 2% softirqs.CPU8.RCU
30563 ± 3% +24.0% 37906 ± 7% softirqs.CPU8.SCHED
131375 ± 2% -12.8% 114531 softirqs.CPU8.TIMER
39276 ± 2% -14.9% 33416 ± 9% softirqs.CPU80.SCHED
138042 ± 7% -21.9% 107876 ± 2% softirqs.CPU80.TIMER
21626 ± 17% -24.1% 16414 ± 4% softirqs.CPU81.RCU
40018 ± 4% -15.6% 33781 ± 7% softirqs.CPU81.SCHED
134117 ± 4% -16.8% 111583 ± 3% softirqs.CPU81.TIMER
134747 -18.0% 110540 ± 3% softirqs.CPU82.TIMER
133707 ± 2% -17.6% 110191 softirqs.CPU83.TIMER
40079 ± 4% -16.3% 33556 ± 10% softirqs.CPU84.SCHED
130536 ± 2% -15.8% 109959 softirqs.CPU84.TIMER
40427 ± 5% -16.9% 33585 ± 7% softirqs.CPU85.SCHED
131805 ± 4% -18.4% 107549 softirqs.CPU85.TIMER
40785 ± 5% -17.5% 33639 ± 7% softirqs.CPU86.SCHED
134917 ± 2% -18.1% 110473 ± 3% softirqs.CPU86.TIMER
39496 ± 3% -15.7% 33280 ± 8% softirqs.CPU87.SCHED
135851 ± 5% -19.3% 109698 ± 2% softirqs.CPU87.TIMER
30947 ± 2% +20.1% 37173 ± 9% softirqs.CPU9.SCHED
133360 ± 3% -14.3% 114339 ± 3% softirqs.CPU9.TIMER
4299 ±103% -66.6% 1438 ± 9% softirqs.NET_RX
1788927 ± 16% -15.2% 1516780 ± 4% softirqs.RCU
11741892 -15.5% 9927286 softirqs.TIMER

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen

