2021-03-15 17:50:30

by Kees Cook

Subject: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

The sysfs interface to seq_file continues to be rather fragile, as seen
with some recent exploits[1]. Move the seq_file buffer to the vmap area
(while retaining the accounting flag), since it has guard pages that
will catch and stop linear overflows. This seems justified given that
seq_file already uses kvmalloc(), is almost always using a PAGE_SIZE or
larger allocation, has allocations that are normally short lived, and is not
normally on a performance critical path.

[1] https://blog.grimm-co.com/2021/03/new-old-bugs-in-linux-kernel.html

Signed-off-by: Kees Cook <[email protected]>
---
 fs/seq_file.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/fs/seq_file.c b/fs/seq_file.c
index cb11a34fb871..16fb4a4e61e3 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -32,7 +32,12 @@ static void seq_set_overflow(struct seq_file *m)

 static void *seq_buf_alloc(unsigned long size)
 {
-	return kvmalloc(size, GFP_KERNEL_ACCOUNT);
+	/*
+	 * To be proactively defensive against buggy seq_get_buf() callers
+	 * (i.e. sysfs handlers), use the vmap area to gain the trailing
+	 * guard page which will protect against linear buffer overflows.
+	 */
+	return __vmalloc(size, GFP_KERNEL_ACCOUNT);
 }

 /**
@@ -130,7 +135,7 @@ static int traverse(struct seq_file *m, loff_t offset)

 Eoverflow:
 	m->op->stop(m, p);
-	kvfree(m->buf);
+	vfree(m->buf);
 	m->count = 0;
 	m->buf = seq_buf_alloc(m->size <<= 1);
 	return !m->buf ? -ENOMEM : -EAGAIN;
@@ -237,7 +242,7 @@ ssize_t seq_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 			goto Fill;
 		// need a bigger buffer
 		m->op->stop(m, p);
-		kvfree(m->buf);
+		vfree(m->buf);
 		m->count = 0;
 		m->buf = seq_buf_alloc(m->size <<= 1);
 		if (!m->buf)
@@ -349,7 +354,7 @@ EXPORT_SYMBOL(seq_lseek);
 int seq_release(struct inode *inode, struct file *file)
 {
 	struct seq_file *m = file->private_data;
-	kvfree(m->buf);
+	vfree(m->buf);
 	kmem_cache_free(seq_file_cache, m);
 	return 0;
 }
@@ -585,7 +590,7 @@ int single_open_size(struct file *file, int (*show)(struct seq_file *, void *),
 		return -ENOMEM;
 	ret = single_open(file, show, data);
 	if (ret) {
-		kvfree(buf);
+		vfree(buf);
 		return ret;
 	}
 	((struct seq_file *)file->private_data)->buf = buf;
--
2.25.1
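
For readers who want to see the guard-page effect the commit message relies on, here is a minimal user-space sketch. It is an analogy only, not kernel code: guarded_alloc() and overflow_is_caught() are hypothetical helpers, and mmap()/mprotect() stand in for the vmap area. The buffer is rounded up to whole pages, so an overrun that stays inside the rounding slack would not be caught.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a kernel API): map `size` bytes rounded up to
 * whole pages, followed by one inaccessible guard page.
 */
static char *guarded_alloc(size_t size)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t usable = (size + page - 1) / page * page;
	char *base = mmap(NULL, usable + page, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (base == MAP_FAILED)
		return NULL;
	/* Revoke all access to the trailing page so a linear overrun faults. */
	if (mprotect(base + usable, page, PROT_NONE) != 0) {
		munmap(base, usable + page);
		return NULL;
	}
	return base;
}

/*
 * Write `len` bytes linearly into a guarded buffer of `size` bytes, in a
 * child process so the fault does not kill the caller.
 * Returns 1 if the child died with SIGSEGV, 0 if it exited cleanly,
 * and -1 on any setup failure.
 */
static int overflow_is_caught(size_t size, size_t len)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		volatile char *buf = guarded_alloc(size);
		size_t i;

		if (!buf)
			_exit(2);
		for (i = 0; i < len; i++)
			buf[i] = 'A';
		_exit(0);
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
		return 1;
	if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
		return 0;
	return -1;
}
```

With a one-page buffer, writing exactly one page succeeds and writing one byte more faults immediately instead of corrupting whatever happens to sit next to the buffer.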


2021-03-15 21:54:57

by Kees Cook

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Mon, Mar 15, 2021 at 06:33:10PM +0000, Al Viro wrote:
> On Mon, Mar 15, 2021 at 10:48:51AM -0700, Kees Cook wrote:
> > The sysfs interface to seq_file continues to be rather fragile, as seen
> > with some recent exploits[1]. Move the seq_file buffer to the vmap area
> > (while retaining the accounting flag), since it has guard pages that
> > will catch and stop linear overflows. This seems justified given that
> > seq_file already uses kvmalloc(), is almost always using a PAGE_SIZE or
> > larger allocation, has allocations are normally short lived, and is not
> > normally on a performance critical path.
>
> You are attacking the wrong part of it. Is there any reason for having
> seq_get_buf() public in the first place?

Completely agreed. seq_get_buf() should be totally ripped out.
Unfortunately, this is going to be a long road because of sysfs's ATTR
stuff, there are something like 5000 callers, and the entire API was
designed to avoid refactoring all those callers from
sysfs_kf_seq_show().

However, since I also need to entirely rewrite the sysfs vs kobj APIs[1]
for CFI, I'm working on a plan to fix it all at once, but based on my
experience refactoring the timer struct, it's going to be a very painful
and long road.

So, in the meantime, I'd like to make this change so we can get bounds
checking for free on seq_file (since it's almost always PAGE_SIZE
anyway).

-Kees

[1] https://lore.kernel.org/lkml/202006112217.2E6CE093@keescook/

--
Kees Cook

2021-03-15 23:46:17

by Al Viro

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Mon, Mar 15, 2021 at 10:48:51AM -0700, Kees Cook wrote:
> The sysfs interface to seq_file continues to be rather fragile, as seen
> with some recent exploits[1]. Move the seq_file buffer to the vmap area
> (while retaining the accounting flag), since it has guard pages that
> will catch and stop linear overflows. This seems justified given that
> seq_file already uses kvmalloc(), is almost always using a PAGE_SIZE or
> larger allocation, has allocations are normally short lived, and is not
> normally on a performance critical path.

You are attacking the wrong part of it. Is there any reason for having
seq_get_buf() public in the first place?

For example, the use in blkcg_print_stat() is entirely due to the bogus
->pd_stat_fn() calling conventions. Fuck scnprintf() games, just pass
seq_file to ->pd_stat_fn() and use seq_printf() instead. Voila - no
seq_get_buf()/seq_commit()/scnprintf() garbage.

tegra use is no better, AFAICS. inifinibarf one... allow me to quote
that gem in full:
static int _driver_stats_seq_show(struct seq_file *s, void *v)
{
	loff_t *spos = v;
	char *buffer;
	u64 *stats = (u64 *)&hfi1_stats;
	size_t sz = seq_get_buf(s, &buffer);

	if (sz < sizeof(u64))
		return SEQ_SKIP;
	/* special case for interrupts */
	if (*spos == 0)
		*(u64 *)buffer = hfi1_sps_ints();
	else
		*(u64 *)buffer = stats[*spos];
	seq_commit(s, sizeof(u64));
	return 0;
}
Yes, really. Not to mention that there's seq_write(), what the _hell_
is it using seq_file for in the first place? Oh, and hfi1_stats is
actually this:
struct hfi1_ib_stats {
	__u64 sps_ints; /* number of interrupts handled */
	__u64 sps_errints; /* number of error interrupts */
	__u64 sps_txerrs; /* tx-related packet errors */
	__u64 sps_rcverrs; /* non-crc rcv packet errors */
	__u64 sps_hwerrs; /* hardware errors reported (parity, etc.) */
	__u64 sps_nopiobufs; /* no pio bufs avail from kernel */
	__u64 sps_ctxts; /* number of contexts currently open */
	__u64 sps_lenerrs; /* number of kernel packets where RHF != LRH len */
	__u64 sps_buffull;
	__u64 sps_hdrfull;
};
I won't go into further details - CDA might be dead and buried, but there
should be some limit to public obscenity ;-/

procfs use is borderline - it looks like there might be a good cause
for seq_escape_str().

And sysfs_kf_seq_show()... Do we want to go through seq_file there at
all?
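
The suggestion above, handing the seq_file to the callback and formatting through seq_printf() rather than seq_get_buf()/scnprintf()/seq_commit(), can be modelled in user space roughly as follows. This is a sketch under assumptions: struct mock_seq, mock_seq_printf() and mock_pd_stat() are illustrative stand-ins, not the kernel definitions, and the overflow convention (count set equal to size) is how seq_file is understood to signal a truncated record.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for struct seq_file (assumption, user space only). */
struct mock_seq {
	char *buf;
	size_t size;
	size_t count;
};

/*
 * Bounded formatted append. If the text does not fit, nothing is
 * committed and the buffer is marked as overflowed (count == size),
 * so the caller never writes past the end.
 */
static void mock_seq_printf(struct mock_seq *m, const char *fmt, ...)
{
	va_list ap;
	int len;

	if (m->count < m->size) {
		va_start(ap, fmt);
		len = vsnprintf(m->buf + m->count, m->size - m->count, fmt, ap);
		va_end(ap);
		if (len >= 0 && m->count + (size_t)len < m->size) {
			m->count += (size_t)len;
			return;
		}
	}
	m->count = m->size;
}

static int mock_seq_has_overflowed(const struct mock_seq *m)
{
	return m->count == m->size;
}

/*
 * Hypothetical stat callback: it receives the seq handle, not a raw
 * (buffer, size) pair, so it has no way to overrun the buffer.
 */
static void mock_pd_stat(struct mock_seq *m, unsigned long long reads,
			 unsigned long long writes)
{
	mock_seq_printf(m, " reads=%llu writes=%llu", reads, writes);
}
```

The callback has no pointer arithmetic left to get wrong; the only thing the core has to do on overflow is retry with a larger buffer.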

2021-03-16 09:39:08

by Michal Hocko

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Mon 15-03-21 10:48:51, Kees Cook wrote:
> The sysfs interface to seq_file continues to be rather fragile, as seen
> with some recent exploits[1]. Move the seq_file buffer to the vmap area
> (while retaining the accounting flag), since it has guard pages that
> will catch and stop linear overflows. This seems justified given that
> seq_file already uses kvmalloc(), is almost always using a PAGE_SIZE or
> larger allocation, has allocations are normally short lived, and is not
> normally on a performance critical path.

I have already objected without having my concerns really addressed.

Your observation that most buffers are PAGE_SIZE in the vast majority
of cases matches my experience, and kmalloc should perform better than
vmalloc. You should check the most common /proc readers at least.

Also this cannot really be done for configurations with a very limited
vmalloc space (32b for example). Those systems are more and more rare
but you shouldn't really allow userspace to deplete the vmalloc space.

I would be also curious to see how vmalloc scales with huge number of
single page allocations which would be easy to trigger with this patch.

> [1] https://blog.grimm-co.com/2021/03/new-old-bugs-in-linux-kernel.html
>
> Signed-off-by: Kees Cook <[email protected]>
> ---
> fs/seq_file.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/fs/seq_file.c b/fs/seq_file.c
> index cb11a34fb871..16fb4a4e61e3 100644
> --- a/fs/seq_file.c
> +++ b/fs/seq_file.c
> @@ -32,7 +32,12 @@ static void seq_set_overflow(struct seq_file *m)
>
> static void *seq_buf_alloc(unsigned long size)
> {
> - return kvmalloc(size, GFP_KERNEL_ACCOUNT);
> + /*
> + * To be proactively defensive against buggy seq_get_buf() callers
> + * (i.e. sysfs handlers), use the vmap area to gain the trailing
> + * guard page which will protect against linear buffer overflows.
> + */
> + return __vmalloc(size, GFP_KERNEL_ACCOUNT);
> }
>
> /**
> @@ -130,7 +135,7 @@ static int traverse(struct seq_file *m, loff_t offset)
>
> Eoverflow:
> m->op->stop(m, p);
> - kvfree(m->buf);
> + vfree(m->buf);
> m->count = 0;
> m->buf = seq_buf_alloc(m->size <<= 1);
> return !m->buf ? -ENOMEM : -EAGAIN;
> @@ -237,7 +242,7 @@ ssize_t seq_read_iter(struct kiocb *iocb, struct iov_iter *iter)
> goto Fill;
> // need a bigger buffer
> m->op->stop(m, p);
> - kvfree(m->buf);
> + vfree(m->buf);
> m->count = 0;
> m->buf = seq_buf_alloc(m->size <<= 1);
> if (!m->buf)
> @@ -349,7 +354,7 @@ EXPORT_SYMBOL(seq_lseek);
> int seq_release(struct inode *inode, struct file *file)
> {
> struct seq_file *m = file->private_data;
> - kvfree(m->buf);
> + vfree(m->buf);
> kmem_cache_free(seq_file_cache, m);
> return 0;
> }
> @@ -585,7 +590,7 @@ int single_open_size(struct file *file, int (*show)(struct seq_file *, void *),
> return -ENOMEM;
> ret = single_open(file, show, data);
> if (ret) {
> - kvfree(buf);
> + vfree(buf);
> return ret;
> }
> ((struct seq_file *)file->private_data)->buf = buf;
> --
> 2.25.1

--
Michal Hocko
SUSE Labs

2021-03-16 13:20:54

by Greg Kroah-Hartman

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Mon, Mar 15, 2021 at 01:43:59PM -0700, Kees Cook wrote:
> On Mon, Mar 15, 2021 at 06:33:10PM +0000, Al Viro wrote:
> > On Mon, Mar 15, 2021 at 10:48:51AM -0700, Kees Cook wrote:
> > > The sysfs interface to seq_file continues to be rather fragile, as seen
> > > with some recent exploits[1]. Move the seq_file buffer to the vmap area
> > > (while retaining the accounting flag), since it has guard pages that
> > > will catch and stop linear overflows. This seems justified given that
> > > seq_file already uses kvmalloc(), is almost always using a PAGE_SIZE or
> > > larger allocation, has allocations are normally short lived, and is not
> > > normally on a performance critical path.
> >
> > You are attacking the wrong part of it. Is there any reason for having
> > seq_get_buf() public in the first place?
>
> Completely agreed. seq_get_buf() should be totally ripped out.
> Unfortunately, this is going to be a long road because of sysfs's ATTR
> stuff, there are something like 5000 callers, and the entire API was
> designed to avoid refactoring all those callers from
> sysfs_kf_seq_show().

What is wrong with the sysfs ATTR stuff? That should make it so that we
do not have to change any caller for any specific change like this, why
can't sysfs or kernfs handle it automatically?

> However, since I also need to entirely rewrite the sysfs vs kobj APIs[1]
> for CFI, I'm working on a plan to fix it all at once, but based on my
> experience refactoring the timer struct, it's going to be a very painful
> and long road.

Oh yeah, that fun. I don't think it's going to be as hard as you think,
as the underlying code is doing the "right thing" here, so this feels
like a problem in the CFI implementation more than anything else.

So what can I do today in sysfs to help fix the seq_get_buf() stuff?
What should it use instead?

thanks,

greg k-h

2021-03-16 19:59:47

by Al Viro

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Tue, Mar 16, 2021 at 08:24:50AM +0100, Greg Kroah-Hartman wrote:

> > Completely agreed. seq_get_buf() should be totally ripped out.
> > Unfortunately, this is going to be a long road because of sysfs's ATTR
> > stuff, there are something like 5000 callers, and the entire API was
> > designed to avoid refactoring all those callers from
> > sysfs_kf_seq_show().
>
> What is wrong with the sysfs ATTR stuff? That should make it so that we
> do not have to change any caller for any specific change like this, why
> can't sysfs or kernfs handle it automatically?

Hard to tell, since that would require _finding_ the sodding ->show()
instances first. Good luck with that, seeing that most of those appear
to come from templates-done-with-cpp...

AFAICS, Kees wants to protect against ->show() instances stomping beyond
the page size. What I don't get is what do you get from using seq_file
if you insist on doing raw access to the buffer rather than using
seq_printf() and friends. What's the point?

2021-03-16 20:10:07

by Greg Kroah-Hartman

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Tue, Mar 16, 2021 at 12:43:12PM +0000, Al Viro wrote:
> On Tue, Mar 16, 2021 at 08:24:50AM +0100, Greg Kroah-Hartman wrote:
>
> > > Completely agreed. seq_get_buf() should be totally ripped out.
> > > Unfortunately, this is going to be a long road because of sysfs's ATTR
> > > stuff, there are something like 5000 callers, and the entire API was
> > > designed to avoid refactoring all those callers from
> > > sysfs_kf_seq_show().
> >
> > What is wrong with the sysfs ATTR stuff? That should make it so that we
> > do not have to change any caller for any specific change like this, why
> > can't sysfs or kernfs handle it automatically?
>
> Hard to tell, since that would require _finding_ the sodding ->show()
> instances first. Good luck with that, seeing that most of those appear
> to come from templates-done-with-cpp...

Sure, auditing all of this is a pain, but the numbers that take a string
are low if someone wants to do that and convert them all to sysfs_emit()
today.

> AFAICS, Kees wants to protect against ->show() instances stomping beyond
> the page size. What I don't get is what do you get from using seq_file
> if you insist on doing raw access to the buffer rather than using
> seq_printf() and friends. What's the point?

I don't understand as I didn't switch kernfs to this api at all anyway,
as it seems to have come from the original sysfs code moving to kernfs
way back in 2013 with the work that Tejun did. So I can't remember any
of that...

thanks,

greg k-h

2021-03-16 20:16:18

by Michal Hocko

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Tue 16-03-21 12:43:12, Al Viro wrote:
[...]
> AFAICS, Kees wants to protect against ->show() instances stomping beyond
> the page size. What I don't get is what do you get from using seq_file
> if you insist on doing raw access to the buffer rather than using
> seq_printf() and friends. What's the point?

I do not think there is any, and as you have said in another response, we
should really make seq_get_buf() internal to seq_file and be done
with that. If there is missing functionality that users work around by
abusing seq_get_buf(), then it should be added to the seq_file interface.
--
Michal Hocko
SUSE Labs

2021-03-16 21:22:01

by Kees Cook

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Tue, Mar 16, 2021 at 09:31:23AM +0100, Michal Hocko wrote:
> On Mon 15-03-21 10:48:51, Kees Cook wrote:
> > The sysfs interface to seq_file continues to be rather fragile, as seen
> > with some recent exploits[1]. Move the seq_file buffer to the vmap area
> > (while retaining the accounting flag), since it has guard pages that
> > will catch and stop linear overflows. This seems justified given that
> > seq_file already uses kvmalloc(), is almost always using a PAGE_SIZE or
> > larger allocation, has allocations are normally short lived, and is not
> > normally on a performance critical path.
>
> I have already objected without having my concerns really addressed.

Sorry, I didn't mean to ignore your comments!

> Your observation that most of buffers are PAGE_SIZE in the vast majority
> cases matches my experience and kmalloc should perform better than
> vmalloc. You should check the most common /proc readers at least.

Yeah, I'm going to build a quick test rig to see some before/after
timings, etc.

> Also this cannot really be done for configurations with a very limited
> vmalloc space (32b for example). Those systems are more and more rare
> but you shouldn't really allow userspace to deplete the vmalloc space.

This sounds like two objections:
- 32b has a small vmalloc space
- userspace shouldn't be allowed to deplete vmalloc space

I'd be happy to make this 64b only. For the latter, I would imagine
there are other vmalloc-exposed-to-userspace cases, but yes, this would
be much more direct. Is that a problem in practice?

> I would be also curious to see how vmalloc scales with huge number of
> single page allocations which would be easy to trigger with this patch.

Right -- what's the best way to measure this (and what would be "too
much")?

--
Kees Cook

2021-03-16 21:22:41

by Kees Cook

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Tue, Mar 16, 2021 at 12:43:12PM +0000, Al Viro wrote:
> On Tue, Mar 16, 2021 at 08:24:50AM +0100, Greg Kroah-Hartman wrote:
>
> > > Completely agreed. seq_get_buf() should be totally ripped out.
> > > Unfortunately, this is going to be a long road because of sysfs's ATTR
> > > stuff, there are something like 5000 callers, and the entire API was
> > > designed to avoid refactoring all those callers from
> > > sysfs_kf_seq_show().
> >
> > What is wrong with the sysfs ATTR stuff? That should make it so that we
> > do not have to change any caller for any specific change like this, why
> > can't sysfs or kernfs handle it automatically?
>
> Hard to tell, since that would require _finding_ the sodding ->show()
> instances first. Good luck with that, seeing that most of those appear
> to come from templates-done-with-cpp...

I *think* I can get coccinelle to find them all, but my brute-force
approach was to just do a debug build changing the ATTR macro to be
typed, and changing the name of "show" and "store" in kobj_attribute
(to make the compiler find them all).

> AFAICS, Kees wants to protect against ->show() instances stomping beyond
> the page size. What I don't get is what do you get from using seq_file
> if you insist on doing raw access to the buffer rather than using
> seq_printf() and friends. What's the point?

To me, it looks like the kernfs/sysfs API happened around the time
"container_of" was gaining ground. It's trying to do the same thing
the "modern" callbacks do with finding a pointer from another, but it
did so by making sure everything had a 0 offset and an identical
beginning structure layout _but changed prototypes_.

It's the changed prototypes that freak out CFI.

My current plan consists of these steps:

- add two new callbacks to the kobj_attribute struct (and its clones):
"seq_show" and "seq_store", which will pass in the seq_file.
- convert all callbacks to kobject/kobj_attribute and use container_of()
to find their respective pointers.
- remove "show" and "store"
- remove external use of seq_get_buf().

The first two steps require thousands of lines of code changed, so
I'm going to try to minimize it by trying to do as many conversions as
possible to the appropriate helpers first. e.g. DEVICE_ATTR_INT exists,
but there are only 2 users, yet there appears to be something like 500
DEVICE_ATTR callers that have an open-coded '%d':

$ git grep -B10 '\bDEVICE_ATTR' | grep '%d' | wc -l
530

--
Kees Cook
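
As an illustration of the container_of() step in the plan above, here is a small user-space sketch. All names (struct mock_attr, struct mock_sensor, sensor_show()) are hypothetical; the point is only that every callback shares one prototype and recovers its enclosing object from the attribute pointer, instead of relying on a zero offset and casting between differently-typed prototypes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* User-space equivalent of the kernel macro of the same name. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* One attribute type with one callback prototype (illustrative only). */
struct mock_attr {
	const char *name;
	int (*show)(struct mock_attr *attr, char *buf, size_t len);
};

/* The attribute is deliberately NOT the first member. */
struct mock_sensor {
	int temperature;
	struct mock_attr attr;
};

/* Recover the enclosing object from the embedded attribute pointer. */
static int sensor_show(struct mock_attr *attr, char *buf, size_t len)
{
	struct mock_sensor *s = container_of(attr, struct mock_sensor, attr);

	return snprintf(buf, len, "%d\n", s->temperature);
}
```

Because the function pointer type never changes, an indirect-call checker sees a call whose prototype matches its target.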

2021-03-17 10:46:23

by Greg Kroah-Hartman

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Tue, Mar 16, 2021 at 12:18:33PM -0700, Kees Cook wrote:
> On Tue, Mar 16, 2021 at 12:43:12PM +0000, Al Viro wrote:
> > On Tue, Mar 16, 2021 at 08:24:50AM +0100, Greg Kroah-Hartman wrote:
> >
> > > > Completely agreed. seq_get_buf() should be totally ripped out.
> > > > Unfortunately, this is going to be a long road because of sysfs's ATTR
> > > > stuff, there are something like 5000 callers, and the entire API was
> > > > designed to avoid refactoring all those callers from
> > > > sysfs_kf_seq_show().
> > >
> > > What is wrong with the sysfs ATTR stuff? That should make it so that we
> > > do not have to change any caller for any specific change like this, why
> > > can't sysfs or kernfs handle it automatically?
> >
> > Hard to tell, since that would require _finding_ the sodding ->show()
> > instances first. Good luck with that, seeing that most of those appear
> > to come from templates-done-with-cpp...
>
> I *think* I can get coccinelle to find them all, but my brute-force
> approach was to just do a debug build changing the ATTR macro to be
> typed, and changing the name of "show" and "store" in kobj_attribute
> (to make the compiler find them all).
>
> > AFAICS, Kees wants to protect against ->show() instances stomping beyond
> > the page size. What I don't get is what do you get from using seq_file
> > if you insist on doing raw access to the buffer rather than using
> > seq_printf() and friends. What's the point?
>
> To me, it looks like the kernfs/sysfs API happened around the time
> "container_of" was gaining ground. It's trying to do the same thing
> the "modern" callbacks do with finding a pointer from another, but it
> did so by making sure everything had a 0 offset and an identical
> beginning structure layout _but changed prototypes_.
>
> It's the changed prototypes that freaks out CFI.
>
> My current plan consists of these steps:
>
> - add two new callbacks to the kobj_attribute struct (and its clones):
> "seq_show" and "seq_store", which will pass in the seq_file.

Ick, why? Why should the callback care about seq_file? Shouldn't any
wrapper logic in the kobject code be able to handle this automatically?

> - convert all callbacks to kobject/kboj_attribute and use container_of()
> to find their respective pointers.

Which callbacks are you talking about here?

> - remove "show" and "store"

Hah!

> - remove external use of seq_get_buf().

So is this the main goal? I still don't understand the sequence file
problem here, what am I missing (besides the CFI stuff that is)?

> The first two steps require thousands of lines of code changed, so
> I'm going to try to minimize it by trying to do as many conversions as
> possible to the appropriate helpers first. e.g. DEVICE_ATTR_INT exists,
> but there are only 2 users, yet there appears to be something like 500
> DEVICE_ATTR callers that have an open-coded '%d':
>
> $ git grep -B10 '\bDEVICE_ATTR' | grep '%d' | wc -l
> 530

That's going to be hard, and a pain, and I really doubt all that useful
as I still can't figure out why this is needed...

thanks,

greg k-h

2021-03-17 12:10:34

by Michal Hocko

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Tue 16-03-21 12:08:02, Kees Cook wrote:
> On Tue, Mar 16, 2021 at 09:31:23AM +0100, Michal Hocko wrote:
[...]
> > Also this cannot really be done for configurations with a very limited
> > vmalloc space (32b for example). Those systems are more and more rare
> > but you shouldn't really allow userspace to deplete the vmalloc space.
>
> This sounds like two objections:
> - 32b has a small vmalloc space
> - userspace shouldn't allow depletion of vmalloc space
>
> I'd be happy to make this 64b only. For the latter, I would imagine
> there are other vmalloc-exposed-to-userspace cases, but yes, this would
> be much more direct. Is that a problem in practice?

vmalloc space shouldn't be a problem for 64b systems but I am not sure
how does vmalloc scale with many small allocations. There were some
changes by Uladzislau who might give us more insight (CCed).

> > I would be also curious to see how vmalloc scales with huge number of
> > single page allocations which would be easy to trigger with this patch.
>
> Right -- what the best way to measure this (and what would be "too
> much")?

Proc is used quite heavily for all sorts of monitoring so I would be
worried about a noticeable slow down.

Btw. I still have problems with the approach. seq_file is intended to
provide safe way to dump values to the userspace. Sacrificing
performance just because of some abuser seems like a wrong way to go as
Al pointed out earlier. Can we simply stop the abuse and disallow to
manipulate the buffer directly? I do realize this might be more tricky
for reasons mentioned in other emails but this is definitely worth
doing.

--
Michal Hocko
SUSE Labs

2021-03-17 13:36:18

by Greg Kroah-Hartman

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> Btw. I still have problems with the approach. seq_file is intended to
> provide safe way to dump values to the userspace. Sacrificing
> performance just because of some abuser seems like a wrong way to go as
> Al pointed out earlier. Can we simply stop the abuse and disallow to
> manipulate the buffer directly? I do realize this might be more tricky
> for reasons mentioned in other emails but this is definitely worth
> doing.

We have to provide a buffer to "write into" somehow, so what is the best
way to stop "abuse" like this?

Right now, we do have helper functions, sysfs_emit(), that know to stop
the overflow of the buffer size, but porting the whole kernel to them is
going to take a bunch of churn, for almost no real benefit except a
potential random driver that might be doing bad things here that we have
not noticed yet.

Other than that, suggestions are welcome!

thanks,

greg k-h

2021-03-17 14:47:24

by Michal Hocko

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Wed 17-03-21 14:34:27, Greg KH wrote:
> On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> > Btw. I still have problems with the approach. seq_file is intended to
> > provide safe way to dump values to the userspace. Sacrificing
> > performance just because of some abuser seems like a wrong way to go as
> > Al pointed out earlier. Can we simply stop the abuse and disallow to
> > manipulate the buffer directly? I do realize this might be more tricky
> > for reasons mentioned in other emails but this is definitely worth
> > doing.
>
> We have to provide a buffer to "write into" somehow, so what is the best
> way to stop "abuse" like this?

What is wrong about using seq_* interface directly?

> Right now, we do have helper functions, sysfs_emit(), that know to stop
> the overflow of the buffer size, but porting the whole kernel to them is
> going to take a bunch of churn, for almost no real benefit except a
> potential random driver that might be doing bad things here that we have
> not noticed yet.

I am not familiar with sysfs, I just got lost in all the indirection but
replacing buffer by the seq_file and operate on that should be possible,
no?

--
Michal Hocko
SUSE Labs

2021-03-17 15:50:07

by Michal Hocko

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Wed 17-03-21 15:56:44, Greg KH wrote:
> On Wed, Mar 17, 2021 at 03:44:16PM +0100, Michal Hocko wrote:
> > On Wed 17-03-21 14:34:27, Greg KH wrote:
> > > On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> > > > Btw. I still have problems with the approach. seq_file is intended to
> > > > provide safe way to dump values to the userspace. Sacrificing
> > > > performance just because of some abuser seems like a wrong way to go as
> > > > Al pointed out earlier. Can we simply stop the abuse and disallow to
> > > > manipulate the buffer directly? I do realize this might be more tricky
> > > > for reasons mentioned in other emails but this is definitely worth
> > > > doing.
> > >
> > > We have to provide a buffer to "write into" somehow, so what is the best
> > > way to stop "abuse" like this?
> >
> > What is wrong about using seq_* interface directly?
>
> Right now every show() callback of sysfs would have to be changed :(

Is this really the case? Would it be too ugly to have an intermediate
buffer and then seq_puts() it into the seq_file inside sysfs_kf_seq_show()?
Sure, that is one copy more than necessary, but this shouldn't be a hot path
or even visible on small strings. So that cost might be worth paying to
destroy an inherently dangerous seq API (seq_get_buf).
--
Michal Hocko
SUSE Labs
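
The intermediate-buffer idea can be sketched in user space like this. It is a model under assumptions, not the kernel implementation: struct mock_seq, mock_seq_write(), mock_kf_seq_show() and the 4096-byte MOCK_PAGE_SIZE are all illustrative. The legacy show(buf) callback keeps its prototype, writes into a private bounce buffer, and only the core copies the result into the seq buffer with a bounds check.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define MOCK_PAGE_SIZE 4096

/* Simplified stand-in for struct seq_file (assumption, user space only). */
struct mock_seq {
	char *buf;
	size_t size;
	size_t count;
};

/* Bounded copy: append if it fits, otherwise mark overflow. */
static int mock_seq_write(struct mock_seq *m, const void *data, size_t len)
{
	if (m->count + len < m->size) {
		memcpy(m->buf + m->count, data, len);
		m->count += len;
		return 0;
	}
	m->count = m->size;
	return -1;
}

/* Unchanged legacy prototype: the callback only ever sees a raw buffer. */
typedef ssize_t (*legacy_show_fn)(char *buf);

/*
 * Core-side wrapper: the callback writes into a bounce buffer, and the
 * wrapper clamps the reported length before copying into the seq buffer.
 */
static ssize_t mock_kf_seq_show(struct mock_seq *m, legacy_show_fn show)
{
	char *bounce = calloc(1, MOCK_PAGE_SIZE);
	ssize_t len;
	int ret;

	if (!bounce)
		return -1;
	len = show(bounce);
	if (len < 0) {
		free(bounce);
		return len;
	}
	if (len >= MOCK_PAGE_SIZE)
		len = MOCK_PAGE_SIZE - 1;
	ret = mock_seq_write(m, bounce, (size_t)len);
	free(bounce);
	return ret;
}

/* Example legacy callback. */
static ssize_t hello_show(char *buf)
{
	return snprintf(buf, MOCK_PAGE_SIZE, "hello\n");
}
```

The cost is one extra copy per read. Note that the wrapper only polices the length the callback reports; it does nothing about a callback that writes past the end of the bounce buffer itself.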

2021-03-17 15:50:36

by Greg Kroah-Hartman

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Wed, Mar 17, 2021 at 04:20:52PM +0100, Michal Hocko wrote:
> On Wed 17-03-21 15:56:44, Greg KH wrote:
> > On Wed, Mar 17, 2021 at 03:44:16PM +0100, Michal Hocko wrote:
> > > On Wed 17-03-21 14:34:27, Greg KH wrote:
> > > > On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> > > > > Btw. I still have problems with the approach. seq_file is intended to
> > > > > provide safe way to dump values to the userspace. Sacrificing
> > > > > performance just because of some abuser seems like a wrong way to go as
> > > > > Al pointed out earlier. Can we simply stop the abuse and disallow to
> > > > > manipulate the buffer directly? I do realize this might be more tricky
> > > > > for reasons mentioned in other emails but this is definitely worth
> > > > > doing.
> > > >
> > > > We have to provide a buffer to "write into" somehow, so what is the best
> > > > way to stop "abuse" like this?
> > >
> > > What is wrong about using seq_* interface directly?
> >
> > Right now every show() callback of sysfs would have to be changed :(
>
> Is this really the case? Would it be too ugly to have an intermediate
> buffer and then seq_puts it into the seq file inside sysfs_kf_seq_show.

Oh, good idea.

> Sure one copy more than necessary but it this shouldn't be a hot path or
> even visible on small strings. So that might be worth destroying an
> inherently dangerous seq API (seq_get_buf).

I'm all for that, let me see if I can carve out some time tomorrow to
try this out.

But, you don't get rid of the "ability" to have a driver write more than
a PAGE_SIZE into the buffer passed to it. I guess I could be paranoid
and do some internal checks (allocate a bunch of memory and check for
overflow by hand), if this is something to really be concerned about...

thanks,

greg k-h

2021-03-17 15:56:54

by Greg Kroah-Hartman

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Wed, Mar 17, 2021 at 03:44:16PM +0100, Michal Hocko wrote:
> On Wed 17-03-21 14:34:27, Greg KH wrote:
> > On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> > > Btw. I still have problems with the approach. seq_file is intended to
> > > provide safe way to dump values to the userspace. Sacrificing
> > > performance just because of some abuser seems like a wrong way to go as
> > > Al pointed out earlier. Can we simply stop the abuse and disallow to
> > > manipulate the buffer directly? I do realize this might be more tricky
> > > for reasons mentioned in other emails but this is definitely worth
> > > doing.
> >
> > We have to provide a buffer to "write into" somehow, so what is the best
> > way to stop "abuse" like this?
>
> What is wrong about using seq_* interface directly?

Right now every show() callback of sysfs would have to be changed :(

> > Right now, we do have helper functions, sysfs_emit(), that know to stop
> > the overflow of the buffer size, but porting the whole kernel to them is
> > going to take a bunch of churn, for almost no real benefit except a
> > potential random driver that might be doing bad things here that we have
> > not noticed yet.
>
> I am not familiar with sysfs, I just got lost in all the indirection but
> replacing buffer by the seq_file and operate on that should be possible,
> no?

sysfs files should be very simple and easy, and have a single value
being written to userspace. I guess seq_printf() does handle the issue
of "big buffers", but there should not be a big buffer here to worry
about in the first place (yes, there was a bug where a driver took
unchecked data and sent it to userspace overflowing the buffer which
started this whole thread...)

I guess Kees wants to change all show functions to use the seq_ api,
which now makes a bit more sense, but still seems like a huge overkill.
But I now understand the idea here, the buffer management is handled by
the core kernel and overflows are impossible.

A "simpler" fix is to keep the api the same today, and just "force"
everyone to use sysfs_emit() which does the length checking
automatically.

I don't know, it all depends on how much effort we want to put into the
"drivers can not do stupid things because we prevent them from it"
type of work here...

thanks,

greg k-h
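
The length checking mentioned above can be illustrated with a small userspace model. This is only a sketch of the idea, not the kernel's sysfs_emit() implementation: the name model_emit() and the MODEL_PAGE_SIZE constant are made up for this example, and the real helper lives in the kernel and works on the PAGE_SIZE buffer that sysfs passes to show() callbacks.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed size for this model; sysfs gives show() a PAGE_SIZE buffer. */
#define MODEL_PAGE_SIZE 4096

/*
 * Hypothetical userspace stand-in for a sysfs_emit()-style helper:
 * format into buf, never store more than MODEL_PAGE_SIZE bytes
 * (including the terminating NUL), and return the number of
 * characters actually stored.
 */
static int model_emit(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(buf, MODEL_PAGE_SIZE, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;			/* formatting error: report nothing */
	if (len >= MODEL_PAGE_SIZE)
		len = MODEL_PAGE_SIZE - 1;	/* output was truncated */
	return len;
}
```

A callback written against such a helper never passes a size at all, so it cannot pass a wrong one; the bound is fixed inside the helper.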

2021-03-17 17:20:08

by Michal Hocko

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Wed 17-03-21 16:38:57, Greg KH wrote:
> On Wed, Mar 17, 2021 at 04:20:52PM +0100, Michal Hocko wrote:
> > On Wed 17-03-21 15:56:44, Greg KH wrote:
> > > On Wed, Mar 17, 2021 at 03:44:16PM +0100, Michal Hocko wrote:
> > > > On Wed 17-03-21 14:34:27, Greg KH wrote:
> > > > > On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> > > > > > Btw. I still have problems with the approach. seq_file is intended to
> > > > > > provide safe way to dump values to the userspace. Sacrificing
> > > > > > performance just because of some abuser seems like a wrong way to go as
> > > > > > Al pointed out earlier. Can we simply stop the abuse and disallow to
> > > > > > manipulate the buffer directly? I do realize this might be more tricky
> > > > > > for reasons mentioned in other emails but this is definitely worth
> > > > > > doing.
> > > > >
> > > > > We have to provide a buffer to "write into" somehow, so what is the best
> > > > > way to stop "abuse" like this?
> > > >
> > > > What is wrong about using seq_* interface directly?
> > >
> > > Right now every show() callback of sysfs would have to be changed :(
> >
> > Is this really the case? Would it be too ugly to have an intermediate
> > buffer and then seq_puts it into the seq file inside sysfs_kf_seq_show.
>
> Oh, good idea.
>
> > Sure one copy more than necessary but it this shouldn't be a hot path or
> > even visible on small strings. So that might be worth destroying an
> > inherently dangerous seq API (seq_get_buf).
>
> I'm all for that, let me see if I can carve out some time tomorrow to
> try this out.
>
> But, you don't get rid of the "ability" to have a driver write more than
> a PAGE_SIZE into the buffer passed to it. I guess I could be paranoid
> and do some internal checks (allocate a bunch of memory and check for
> overflow by hand), if this is something to really be concerned about...

Yes, this is certainly possible and it will need to be addressed somehow.
My point was that we shouldn't cripple seq_file just because the API allows
for abuse. Sysfs needs to find a way to handle the internal PAGE_SIZE
buffer assumption in any case.
--
Michal Hocko
SUSE Labs
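
The intermediate buffer idea, combined with the "check for overflow by hand" suggestion, could look roughly like the userspace model below. Everything here is an assumption made for illustration: guarded_show(), the show_fn type, the guard size, and the two example callbacks are hypothetical and are not taken from sysfs_kf_seq_show(). Note that a guard region only detects an overflow after it has happened, and only if the overflow stays within the guard.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096	/* assumed page size for this model */
#define GUARD_SIZE 64		/* hypothetical guard region length */
#define GUARD_BYTE 0xA5		/* pattern used to detect overwrites */

/* Hypothetical show() callback type: fills buf, returns length. */
typedef long (*show_fn)(char *buf);

/*
 * Run show() on a scratch buffer followed by a guard region, then
 * copy the result to out only if the guard is intact and the
 * reported length is sane. Returns the length copied, or -1.
 */
static long guarded_show(show_fn show, char *out, size_t out_size)
{
	unsigned char *scratch;
	long len, ret = -1;
	size_t i;

	scratch = malloc(MODEL_PAGE_SIZE + GUARD_SIZE);
	if (!scratch)
		return -1;
	memset(scratch, 0, MODEL_PAGE_SIZE);
	memset(scratch + MODEL_PAGE_SIZE, GUARD_BYTE, GUARD_SIZE);

	len = show((char *)scratch);

	for (i = 0; i < GUARD_SIZE; i++)
		if (scratch[MODEL_PAGE_SIZE + i] != GUARD_BYTE)
			goto out;	/* callback wrote past the page */

	if (len < 0 || len >= MODEL_PAGE_SIZE || (size_t)len >= out_size)
		goto out;		/* reported length is not usable */

	memcpy(out, scratch, (size_t)len);
	out[len] = '\0';
	ret = len;
out:
	free(scratch);
	return ret;
}

/* Example callback that stays within the page. */
static long good_show(char *buf)
{
	return snprintf(buf, MODEL_PAGE_SIZE, "%d\n", 42);
}

/* Example callback that writes 8 bytes past the page. */
static long bad_show(char *buf)
{
	memset(buf, 'A', MODEL_PAGE_SIZE + 8);
	return MODEL_PAGE_SIZE + 8;
}
```

The extra copy is the cost mentioned above; the benefit is that a misbehaving callback damages only the scratch allocation rather than the buffer that is handed on.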

2021-03-17 21:50:38

by Kees Cook

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Wed, Mar 17, 2021 at 04:38:57PM +0100, Greg Kroah-Hartman wrote:
> On Wed, Mar 17, 2021 at 04:20:52PM +0100, Michal Hocko wrote:
> > On Wed 17-03-21 15:56:44, Greg KH wrote:
> > > On Wed, Mar 17, 2021 at 03:44:16PM +0100, Michal Hocko wrote:
> > > > On Wed 17-03-21 14:34:27, Greg KH wrote:
> > > > > On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> > > > > > Btw. I still have problems with the approach. seq_file is intended to
> > > > > > provide safe way to dump values to the userspace. Sacrificing
> > > > > > performance just because of some abuser seems like a wrong way to go as
> > > > > > Al pointed out earlier. Can we simply stop the abuse and disallow to
> > > > > > manipulate the buffer directly? I do realize this might be more tricky
> > > > > > for reasons mentioned in other emails but this is definitely worth
> > > > > > doing.
> > > > >
> > > > > We have to provide a buffer to "write into" somehow, so what is the best
> > > > > way to stop "abuse" like this?
> > > >
> > > > What is wrong about using seq_* interface directly?
> > >
> > > Right now every show() callback of sysfs would have to be changed :(
> >
> > Is this really the case? Would it be too ugly to have an intermediate
> > buffer and then seq_puts it into the seq file inside sysfs_kf_seq_show.
>
> Oh, good idea.
>
> > Sure one copy more than necessary but it this shouldn't be a hot path or
> > even visible on small strings. So that might be worth destroying an
> > inherently dangerous seq API (seq_get_buf).
>
> I'm all for that, let me see if I can carve out some time tomorrow to
> try this out.

The trouble has been that C string APIs are just so impossibly fragile.
We just get too many bugs with it, so we really do need to rewrite the
callbacks to use seq_file, since it has a safe API.

I've been trying to write coccinelle scripts to do some of this
refactoring, but I have not found a silver bullet. (This is why I've
suggested adding the temporary "seq_show" and "seq_store" functions, so
we can transition all the callbacks without a flag day.)

> But, you don't get rid of the "ability" to have a driver write more than
> a PAGE_SIZE into the buffer passed to it. I guess I could be paranoid
> and do some internal checks (allocate a bunch of memory and check for
> overflow by hand), if this is something to really be concerned about...

Besides the CFI prototype enforcement changes (which I can build into
the new seq_show/seq_store callbacks), the buffer management is the
primary issue: we just can't hand drivers a string (even with a length)
because the C functions are terrible. e.g. just look at the snprintf vs
scnprintf -- we constantly have to just build completely new API when
what we need is a safe way (i.e. obfuscated away from the caller) to
build a string. Luckily seq_file does this already, so leaning into that
is good here.

--
Kees Cook
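
The snprintf vs scnprintf point can be made concrete with a small userspace sketch. snprintf() returns the length the output would have had without truncation, so using its return value as an offset can run past the end of the buffer. The kernel's scnprintf() returns the number of characters actually stored. The functions model_scnprintf() and append_twice() below are stand-ins written for this example, not the kernel code.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-in modelled on scnprintf() semantics: return the
 * number of characters actually stored in buf, not counting the NUL.
 */
static int model_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	len = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;		/* formatting error */
	if ((size_t)len >= size)
		return (int)(size - 1);	/* truncated: report what fits */
	return len;
}

/*
 * Append the same string twice. Because each call reports only what
 * was stored, off never exceeds size - 1 and size - off never wraps.
 */
static size_t append_twice(char *buf, size_t size, const char *s)
{
	size_t off = 0;

	off += (size_t)model_scnprintf(buf + off, size - off, "%s", s);
	off += (size_t)model_scnprintf(buf + off, size - off, "%s", s);
	return off;
}
```

With an 8-byte buffer and a 10-character string, plain snprintf() reports 10 while the stand-in reports 7; doing the same accumulation with snprintf() would make size - off wrap around on the second call.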

2021-03-18 08:09:42

by Greg Kroah-Hartman

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Wed, Mar 17, 2021 at 02:30:47PM -0700, Kees Cook wrote:
> On Wed, Mar 17, 2021 at 04:38:57PM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Mar 17, 2021 at 04:20:52PM +0100, Michal Hocko wrote:
> > > On Wed 17-03-21 15:56:44, Greg KH wrote:
> > > > On Wed, Mar 17, 2021 at 03:44:16PM +0100, Michal Hocko wrote:
> > > > > On Wed 17-03-21 14:34:27, Greg KH wrote:
> > > > > > On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> > > > > > > Btw. I still have problems with the approach. seq_file is intended to
> > > > > > > provide safe way to dump values to the userspace. Sacrificing
> > > > > > > performance just because of some abuser seems like a wrong way to go as
> > > > > > > Al pointed out earlier. Can we simply stop the abuse and disallow to
> > > > > > > manipulate the buffer directly? I do realize this might be more tricky
> > > > > > > for reasons mentioned in other emails but this is definitely worth
> > > > > > > doing.
> > > > > >
> > > > > > We have to provide a buffer to "write into" somehow, so what is the best
> > > > > > way to stop "abuse" like this?
> > > > >
> > > > > What is wrong about using seq_* interface directly?
> > > >
> > > > Right now every show() callback of sysfs would have to be changed :(
> > >
> > > Is this really the case? Would it be too ugly to have an intermediate
> > > buffer and then seq_puts it into the seq file inside sysfs_kf_seq_show.
> >
> > Oh, good idea.
> >
> > > Sure one copy more than necessary but it this shouldn't be a hot path or
> > > even visible on small strings. So that might be worth destroying an
> > > inherently dangerous seq API (seq_get_buf).
> >
> > I'm all for that, let me see if I can carve out some time tomorrow to
> > try this out.
>
> The trouble has been that C string APIs are just so impossibly fragile.
> We just get too many bugs with it, so we really do need to rewrite the
> callbacks to use seq_file, since it has a safe API.
>
> I've been trying to write coccinelle scripts to do some of this
> refactoring, but I have not found a silver bullet. (This is why I've
> suggested adding the temporary "seq_show" and "seq_store" functions, so
> we can transition all the callbacks without a flag day.)
>
> > But, you don't get rid of the "ability" to have a driver write more than
> > a PAGE_SIZE into the buffer passed to it. I guess I could be paranoid
> > and do some internal checks (allocate a bunch of memory and check for
> > overflow by hand), if this is something to really be concerned about...
>
> Besides the CFI prototype enforcement changes (which I can build into
> the new seq_show/seq_store callbacks), the buffer management is the
> primary issue: we just can't hand drivers a string (even with a length)
> because the C functions are terrible. e.g. just look at the snprintf vs
> scnprintf -- we constantly have to just build completely new API when
> what we need is a safe way (i.e. obfuscated away from the caller) to
> build a string. Luckily seq_file does this already, so leaning into that
> is good here.

But, is it really worth the churn here?

Yes, strings in C are "hard", but this _should_ be a simple thing for any
driver to handle:
return sysfs_emit(buffer, "%d\n", my_dev->value);

To change that to:
return seq_printf(seq, "%d\n", my_dev->value);
feels very much "don't we have other more valuable things we could be
doing?"

So far we have found 1 driver that messed up and overflowed the buffer
that I know of. While reworking apis to make it "hard to get wrong" is
a great goal, the "protection" gained here feels very low compared to
the work involved.

How about moving everyone to sysfs_emit() first? That way it becomes
much more "obvious" when drivers are doing stupid things with their
sysfs buffer. But even then, it would not have caught the iscsi issue
as that was printing a user-provided string so maybe I'm just feeling
grumpy about the potential churn here...

I don't know...

greg k-h

2021-03-18 15:53:43

by Kees Cook

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Thu, Mar 18, 2021 at 09:07:45AM +0100, Greg Kroah-Hartman wrote:
> On Wed, Mar 17, 2021 at 02:30:47PM -0700, Kees Cook wrote:
> > On Wed, Mar 17, 2021 at 04:38:57PM +0100, Greg Kroah-Hartman wrote:
> > > On Wed, Mar 17, 2021 at 04:20:52PM +0100, Michal Hocko wrote:
> > > > On Wed 17-03-21 15:56:44, Greg KH wrote:
> > > > > On Wed, Mar 17, 2021 at 03:44:16PM +0100, Michal Hocko wrote:
> > > > > > On Wed 17-03-21 14:34:27, Greg KH wrote:
> > > > > > > On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> > > > > > > > Btw. I still have problems with the approach. seq_file is intended to
> > > > > > > > provide safe way to dump values to the userspace. Sacrificing
> > > > > > > > performance just because of some abuser seems like a wrong way to go as
> > > > > > > > Al pointed out earlier. Can we simply stop the abuse and disallow to
> > > > > > > > manipulate the buffer directly? I do realize this might be more tricky
> > > > > > > > for reasons mentioned in other emails but this is definitely worth
> > > > > > > > doing.
> > > > > > >
> > > > > > > We have to provide a buffer to "write into" somehow, so what is the best
> > > > > > > way to stop "abuse" like this?
> > > > > >
> > > > > > What is wrong about using seq_* interface directly?
> > > > >
> > > > > Right now every show() callback of sysfs would have to be changed :(
> > > >
> > > > Is this really the case? Would it be too ugly to have an intermediate
> > > > buffer and then seq_puts it into the seq file inside sysfs_kf_seq_show.
> > >
> > > Oh, good idea.
> > >
> > > > Sure one copy more than necessary but it this shouldn't be a hot path or
> > > > even visible on small strings. So that might be worth destroying an
> > > > inherently dangerous seq API (seq_get_buf).
> > >
> > > I'm all for that, let me see if I can carve out some time tomorrow to
> > > try this out.
> >
> > The trouble has been that C string APIs are just so impossibly fragile.
> > We just get too many bugs with it, so we really do need to rewrite the
> > callbacks to use seq_file, since it has a safe API.
> >
> > I've been trying to write coccinelle scripts to do some of this
> > refactoring, but I have not found a silver bullet. (This is why I've
> > suggested adding the temporary "seq_show" and "seq_store" functions, so
> > we can transition all the callbacks without a flag day.)
> >
> > > But, you don't get rid of the "ability" to have a driver write more than
> > > a PAGE_SIZE into the buffer passed to it. I guess I could be paranoid
> > > and do some internal checks (allocate a bunch of memory and check for
> > > overflow by hand), if this is something to really be concerned about...
> >
> > Besides the CFI prototype enforcement changes (which I can build into
> > the new seq_show/seq_store callbacks), the buffer management is the
> > primary issue: we just can't hand drivers a string (even with a length)
> > because the C functions are terrible. e.g. just look at the snprintf vs
> > scnprintf -- we constantly have to just build completely new API when
> > what we need is a safe way (i.e. obfuscated away from the caller) to
> > build a string. Luckily seq_file does this already, so leaning into that
> > is good here.
>
> But, is it really worth the churn here?
>
> Yes, strings in C is "hard", but this _should_ be a simple thing for any
> driver to handle:
> return sysfs_emit(buffer, "%d\n", my_dev->value);
>
> To change that to:
> return seq_printf(seq, "%d\n", my_dev->value);
> feels very much "don't we have other more valuable things we could be
> doing?"
>
> So far we have found 1 driver that messed up and overflowed the buffer
> that I know of. While reworking apis to make it "hard to get wrong" is
> a great goal, the work involved here vs. any "protection" feels very
> low.

I haven't been keeping a list, but it's not the only one. The _other_
reason we need seq_file is so we can perform checks against f_cred for
things like %p obfuscation (as was needed for modules, which I hacked
around), and it is needed for a proper bug fix for the kernel pointer
exposure bug from the same batch. So now I'm up to 3 distinct reasons that
the sysfs API is lacking -- I think it's worth the churn and time.

> How about moving everyone to sysfs_emit() first? That way it becomes
> much more "obvious" when drivers are doing stupid things with their
> sysfs buffer. But even then, it would not have caught the iscsi issue
> as that was printing a user-provided string so maybe I'm just feeling
> grumpy about the potential churn here...

I need to fix the prototypes for CFI sanity too. Switching to seq_file
solves 2 problems, and if we have to change the prototype once for that,
we can include the prototype fixes for CFI at the same time to avoid
double the churn.

--
Kees Cook

2021-03-18 17:58:20

by Greg Kroah-Hartman

Subject: Re: [PATCH v2] seq_file: Unconditionally use vmalloc for buffer

On Thu, Mar 18, 2021 at 08:51:45AM -0700, Kees Cook wrote:
> On Thu, Mar 18, 2021 at 09:07:45AM +0100, Greg Kroah-Hartman wrote:
> > On Wed, Mar 17, 2021 at 02:30:47PM -0700, Kees Cook wrote:
> > > On Wed, Mar 17, 2021 at 04:38:57PM +0100, Greg Kroah-Hartman wrote:
> > > > On Wed, Mar 17, 2021 at 04:20:52PM +0100, Michal Hocko wrote:
> > > > > On Wed 17-03-21 15:56:44, Greg KH wrote:
> > > > > > On Wed, Mar 17, 2021 at 03:44:16PM +0100, Michal Hocko wrote:
> > > > > > > On Wed 17-03-21 14:34:27, Greg KH wrote:
> > > > > > > > On Wed, Mar 17, 2021 at 01:08:21PM +0100, Michal Hocko wrote:
> > > > > > > > > Btw. I still have problems with the approach. seq_file is intended to
> > > > > > > > > provide safe way to dump values to the userspace. Sacrificing
> > > > > > > > > performance just because of some abuser seems like a wrong way to go as
> > > > > > > > > Al pointed out earlier. Can we simply stop the abuse and disallow to
> > > > > > > > > manipulate the buffer directly? I do realize this might be more tricky
> > > > > > > > > for reasons mentioned in other emails but this is definitely worth
> > > > > > > > > doing.
> > > > > > > >
> > > > > > > > We have to provide a buffer to "write into" somehow, so what is the best
> > > > > > > > way to stop "abuse" like this?
> > > > > > >
> > > > > > > What is wrong about using seq_* interface directly?
> > > > > >
> > > > > > Right now every show() callback of sysfs would have to be changed :(
> > > > >
> > > > > Is this really the case? Would it be too ugly to have an intermediate
> > > > > buffer and then seq_puts it into the seq file inside sysfs_kf_seq_show.
> > > >
> > > > Oh, good idea.
> > > >
> > > > > Sure one copy more than necessary but it this shouldn't be a hot path or
> > > > > even visible on small strings. So that might be worth destroying an
> > > > > inherently dangerous seq API (seq_get_buf).
> > > >
> > > > I'm all for that, let me see if I can carve out some time tomorrow to
> > > > try this out.
> > >
> > > The trouble has been that C string APIs are just so impossibly fragile.
> > > We just get too many bugs with it, so we really do need to rewrite the
> > > callbacks to use seq_file, since it has a safe API.
> > >
> > > I've been trying to write coccinelle scripts to do some of this
> > > refactoring, but I have not found a silver bullet. (This is why I've
> > > suggested adding the temporary "seq_show" and "seq_store" functions, so
> > > we can transition all the callbacks without a flag day.)
> > >
> > > > But, you don't get rid of the "ability" to have a driver write more than
> > > > a PAGE_SIZE into the buffer passed to it. I guess I could be paranoid
> > > > and do some internal checks (allocate a bunch of memory and check for
> > > > overflow by hand), if this is something to really be concerned about...
> > >
> > > Besides the CFI prototype enforcement changes (which I can build into
> > > the new seq_show/seq_store callbacks), the buffer management is the
> > > primary issue: we just can't hand drivers a string (even with a length)
> > > because the C functions are terrible. e.g. just look at the snprintf vs
> > > scnprintf -- we constantly have to just build completely new API when
> > > what we need is a safe way (i.e. obfuscated away from the caller) to
> > > build a string. Luckily seq_file does this already, so leaning into that
> > > is good here.
> >
> > But, is it really worth the churn here?
> >
> > Yes, strings in C is "hard", but this _should_ be a simple thing for any
> > driver to handle:
> > return sysfs_emit(buffer, "%d\n", my_dev->value);
> >
> > To change that to:
> > return seq_printf(seq, "%d\n", my_dev->value);
> > feels very much "don't we have other more valuable things we could be
> > doing?"
> >
> > So far we have found 1 driver that messed up and overflowed the buffer
> > that I know of. While reworking apis to make it "hard to get wrong" is
> > a great goal, the work involved here vs. any "protection" feels very
> > low.
>
> I haven't been keeping a list, but it's not the only one. The _other_
> reason we need seq_file is so we can perform checks against f_cred for
> things like %p obfuscation (as was needed for modules that I hacked
> around) and is needed a proper bug fix for the kernel pointer exposure
> bug from the same batch. So now I'm up to 3 distinct reasons that the
> sysfs API is lacking -- I think it's worth the churn and time.

Ok, if you think so.

But if we do this, can we not do a "raw" seqfile api? I would like to
see only 1 function that works like sysfs_emit() does. Perhaps:
void sysfs_printf(struct attribute *attr, const char *fmt, ...);

and then from there we can "derive" things like:
void device_printf(struct device_attribute *attr, const char *fmt, ...);

You can "hide" the needed seq_file structure in the attribute structure
for the buffer management, but I don't think we need the crazy multiple
ways that seq_printf() has morphed into over the years, right?

seq_path() anyone?

binary attribute files are a totally different thing, and probably can
just be left alone for now.

> > How about moving everyone to sysfs_emit() first? That way it becomes
> > much more "obvious" when drivers are doing stupid things with their
> > sysfs buffer. But even then, it would not have caught the iscsi issue
> > as that was printing a user-provided string so maybe I'm just feeling
> > grumpy about the potential churn here...
>
> I need to fix the prototypes for CFI sanity too. Switching to seq_file
> solves 2 problems, and if we have to change the prototype once for that,
> we can include the prototype fixes for CFI at the same time to avoid
> double the churn.

Yes, let's not go through this twice...

thanks,

greg k-h
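
The proposed sysfs_printf() shape, with the buffer management hidden behind the attribute, might be modelled in userspace as below. This is a sketch under stated assumptions: struct model_seq, struct model_attribute, model_sysfs_printf() and model_has_overflowed() are hypothetical names for this example, and no such interface exists in the thread beyond the prototype suggested above. The overflow handling follows the seq_file convention of marking overflow by setting count equal to size.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical hidden buffer state, loosely modelled on seq_file. */
struct model_seq {
	char *buf;
	size_t size;
	size_t count;	/* bytes used; count == size marks overflow */
};

/* Hypothetical attribute that carries the hidden buffer. */
struct model_attribute {
	const char *name;
	struct model_seq *seq;
};

/*
 * Append formatted output. The callback never sees the buffer or
 * its length, so it has no pointer to overrun; output that does
 * not fit only marks the buffer as overflowed.
 */
static void model_sysfs_printf(struct model_attribute *attr,
			       const char *fmt, ...)
{
	struct model_seq *m = attr->seq;
	va_list args;
	int len;

	if (m->count >= m->size)
		return;			/* already overflowed */

	va_start(args, fmt);
	len = vsnprintf(m->buf + m->count, m->size - m->count, fmt, args);
	va_end(args);

	if (len >= 0 && (size_t)len < m->size - m->count) {
		m->count += (size_t)len;
		return;
	}
	m->count = m->size;		/* mark overflow */
}

/* True once any output failed to fit. */
static int model_has_overflowed(const struct model_seq *m)
{
	return m->count == m->size;
}
```

In this model the core code, not the driver, decides what to do on overflow, for example retrying with a larger buffer or returning an error.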

2021-03-19 14:14:19

by kernel test robot

Subject: [seq_file] 5fd6060e50: stress-ng.eventfd.ops_per_sec -49.1% regression



Greeting,

FYI, we noticed a -49.1% regression of stress-ng.eventfd.ops_per_sec due to commit:


commit: 5fd6060e506cc226f2575b01baef7af9ca76aa44 ("[PATCH v2] seq_file: Unconditionally use vmalloc for buffer")
url: https://github.com/0day-ci/linux/commits/Kees-Cook/seq_file-Unconditionally-use-vmalloc-for-buffer/20210316-015127
base: https://git.kernel.org/cgit/linux/kernel/git/kees/linux.git for-next/pstore

in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with following parameters:

nr_threads: 10%
disk: 1HDD
testtime: 60s
fs: ext4
class: os
test: eventfd
cpufreq_governor: performance
ucode: 0x5003006




If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <[email protected]>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml
bin/lkp run compatible-job.yaml

=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
os/gcc-9/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp5/eventfd/stress-ng/60s/0x5003006

commit:
7db688e99c ("pstore/ram: Rate-limit "uncorrectable error in header" message")
5fd6060e50 ("seq_file: Unconditionally use vmalloc for buffer")

7db688e99c0f770a 5fd6060e506cc226f2575b01bae
---------------- ---------------------------
%stddev %change %stddev
\ | \
32516754 -49.1% 16547637 stress-ng.eventfd.ops
541944 -49.1% 275793 stress-ng.eventfd.ops_per_sec
55102 ± 2% +7.2% 59091 ± 2% stress-ng.time.involuntary_context_switches
929.67 -3.9% 893.67 stress-ng.time.percent_of_cpu_this_job_got
30.90 -49.0% 15.77 ± 2% stress-ng.time.user_time
65019233 -49.1% 33087762 stress-ng.time.voluntary_context_switches
8.90 +2.7% 9.14 iostat.cpu.system
38002 ± 5% -11.0% 33835 ± 11% sched_debug.cfs_rq:/.min_vruntime.max
2004321 -49.0% 1021836 vmstat.system.cs
0.11 ± 2% -0.0 0.09 mpstat.cpu.all.soft%
0.46 -0.2 0.26 ± 2% mpstat.cpu.all.usr%
261727 ± 6% +2257.7% 6170837 ± 9% numa-numastat.node0.local_node
309032 ± 9% +1909.5% 6209968 ± 9% numa-numastat.node0.numa_hit
317574 ± 8% +3280.6% 10735840 ± 7% numa-numastat.node1.local_node
356782 ± 10% +2922.4% 10783284 ± 7% numa-numastat.node1.numa_hit
1090175 ± 18% +235.6% 3658217 ± 10% numa-vmstat.node0.numa_hit
999598 ± 15% +262.0% 3618161 ± 10% numa-vmstat.node0.numa_local
949368 ± 21% +567.3% 6335563 ± 8% numa-vmstat.node1.numa_hit
787502 ± 19% +677.6% 6123276 ± 8% numa-vmstat.node1.numa_local
4.116e+08 -24.7% 3.099e+08 ± 3% cpuidle.C1.time
35296067 ± 2% -63.7% 12799762 ± 8% cpuidle.C1.usage
8130367 ± 24% +76.7% 14367098 ± 16% cpuidle.C1E.usage
61265170 ± 2% -50.7% 30225048 cpuidle.POLL.time
30192926 ± 3% -49.4% 15263734 ± 3% cpuidle.POLL.usage
585.67 ±103% -77.6% 131.33 ± 47% proc-vmstat.numa_hint_faults
454.67 ±131% -86.3% 62.17 ± 98% proc-vmstat.numa_hint_faults_local
696009 +2325.3% 16880339 proc-vmstat.numa_hit
609418 +2655.7% 16793735 proc-vmstat.numa_local
1040066 ± 2% +1546.5% 17125024 proc-vmstat.pgalloc_normal
911730 ± 2% +1764.3% 16996996 proc-vmstat.pgfree
37931 ± 3% -11.8% 33457 slabinfo.filp.active_objs
1204 ± 3% -12.3% 1057 slabinfo.filp.active_slabs
38571 ± 3% -12.3% 33837 slabinfo.filp.num_objs
1204 ± 3% -12.3% 1057 slabinfo.filp.num_slabs
14031 ± 2% -24.8% 10556 slabinfo.kmalloc-256.active_objs
442.33 ± 2% -25.5% 329.50 slabinfo.kmalloc-256.active_slabs
14165 ± 2% -25.4% 10562 slabinfo.kmalloc-256.num_objs
442.33 ± 2% -25.5% 329.50 slabinfo.kmalloc-256.num_slabs
9871 +309.4% 40411 ± 5% slabinfo.vmap_area.active_objs
154.00 +310.3% 631.83 ± 4% slabinfo.vmap_area.active_slabs
9887 +309.2% 40463 ± 4% slabinfo.vmap_area.num_objs
154.00 +310.3% 631.83 ± 4% slabinfo.vmap_area.num_slabs
19298 ± 13% -52.4% 9195 ± 10% softirqs.CPU15.RCU
10810 ± 5% -18.3% 8826 ± 14% softirqs.CPU15.SCHED
20339 ± 27% -43.2% 11558 ± 19% softirqs.CPU16.RCU
19915 ± 25% -38.9% 12165 ± 15% softirqs.CPU17.RCU
43084 ± 16% -45.4% 23524 ± 34% softirqs.CPU2.RCU
31777 ± 48% -43.6% 17935 ± 17% softirqs.CPU26.RCU
39011 ± 23% -66.2% 13199 ± 24% softirqs.CPU32.RCU
11700 ± 5% -13.2% 10158 ± 6% softirqs.CPU32.SCHED
10293 ± 5% +12.4% 11570 ± 2% softirqs.CPU44.SCHED
10704 ± 3% +14.0% 12201 ± 9% softirqs.CPU45.SCHED
33336 ± 17% -50.4% 16549 ± 39% softirqs.CPU5.RCU
24875 ± 15% -38.3% 15340 ± 29% softirqs.CPU53.RCU
26395 ± 33% -54.3% 12055 ± 19% softirqs.CPU55.RCU
28254 ± 33% -52.7% 13377 ± 24% softirqs.CPU61.RCU
29803 ± 37% -63.0% 11033 ± 25% softirqs.CPU63.RCU
22122 ± 17% -37.6% 13802 ± 23% softirqs.CPU7.RCU
32270 ± 25% -47.7% 16871 ± 20% softirqs.CPU78.RCU
31018 ± 26% -49.1% 15784 ± 45% softirqs.CPU81.RCU
10264 ± 3% +16.5% 11957 ± 11% softirqs.CPU84.SCHED
10086 ± 6% +17.2% 11819 ± 8% softirqs.CPU86.SCHED
2302045 -28.8% 1638717 softirqs.RCU
0.01 ± 42% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
0.01 ±100% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.pipe_read.new_sync_read.vfs_read
0.97 ± 99% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait
0.01 ± 14% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop
0.28 ±142% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
0.01 ± 65% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.smpboot_thread_fn.kthread.ret_from_fork
0.82 ± 46% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
0.00 ± 22% -100.0% 0.00 perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.do_task_dead.do_exit.do_group_exit
0.01 ± 17% -100.0% 0.00 perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
0.58 ±182% -100.0% 0.00 perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.pipe_read.new_sync_read.vfs_read
5.89 ± 99% -100.0% 0.00 perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait
0.01 ± 14% -100.0% 0.00 perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop
1.34 ±140% -100.0% 0.00 perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
0.21 ±211% -100.0% 0.00 perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.smpboot_thread_fn.kthread.ret_from_fork
3.21 ± 15% -100.0% 0.00 perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
0.13 ±188% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.do_task_dead.do_exit.do_group_exit
0.10 ±128% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.pipe_read.new_sync_read.vfs_read
3.21 ± 25% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait
3.03 ± 26% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
0.07 ±140% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.smpboot_thread_fn.kthread.ret_from_fork
0.82 ± 46% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
7.17 ± 32% -100.0% 0.00 perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.do_task_dead.do_exit.do_group_exit
172.33 ± 96% -100.0% 0.00 perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.pipe_read.new_sync_read.vfs_read
96.17 -100.0% 0.00 perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.stop_one_cpu
6.67 ± 20% -100.0% 0.00 perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait
4.67 ± 42% -100.0% 0.00 perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
96.83 -100.0% 0.00 perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.smpboot_thread_fn.kthread.ret_from_fork
4.83 ± 45% -100.0% 0.00 perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
0.61 ±194% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.do_task_dead.do_exit.do_group_exit
1.12 ± 83% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.pipe_read.new_sync_read.vfs_read
11.52 ± 22% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait
4.32 ± 10% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
5.92 ±152% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.smpboot_thread_fn.kthread.ret_from_fork
3.21 ± 15% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
0.09 ±142% -100.0% 0.00 perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.pipe_read.new_sync_read.vfs_read
2.24 ± 19% -100.0% 0.00 perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait
2.74 ± 19% -100.0% 0.00 perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
0.70 ± 76% -100.0% 0.00 perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.pipe_read.new_sync_read.vfs_read
9.24 ± 24% -100.0% 0.00 perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait
4.31 ± 10% -100.0% 0.00 perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
17.80 ± 3% -25.3% 13.29 ± 4% perf-stat.i.MPKI
3.91e+09 -36.6% 2.479e+09 perf-stat.i.branch-instructions
1.16 +0.1 1.24 ± 3% perf-stat.i.branch-miss-rate%
44593160 -32.9% 29918950 ± 2% perf-stat.i.branch-misses
45190201 ± 6% -51.9% 21750001 ± 6% perf-stat.i.cache-misses
3.369e+08 ± 3% -54.6% 1.529e+08 ± 3% perf-stat.i.cache-references
2069040 -49.0% 1054803 perf-stat.i.context-switches
1.90 +42.6% 2.71 perf-stat.i.cpi
3.524e+10 -11.4% 3.123e+10 perf-stat.i.cpu-cycles
171.19 ± 4% -16.7% 142.54 ± 2% perf-stat.i.cpu-migrations
897.91 ± 6% +72.0% 1544 ± 4% perf-stat.i.cycles-between-cache-misses
0.00 ± 55% +0.0 0.02 ± 26% perf-stat.i.dTLB-load-miss-rate%
90639 ± 20% +379.4% 434539 ± 10% perf-stat.i.dTLB-load-misses
5.159e+09 -39.3% 3.133e+09 perf-stat.i.dTLB-loads
0.00 ± 24% +0.0 0.00 ± 25% perf-stat.i.dTLB-store-miss-rate%
25252 ± 10% +104.3% 51578 ± 12% perf-stat.i.dTLB-store-misses
3.225e+09 -44.8% 1.781e+09 perf-stat.i.dTLB-stores
76.00 -3.5 72.47 perf-stat.i.iTLB-load-miss-rate%
34565197 -48.9% 17646961 perf-stat.i.iTLB-load-misses
10478017 ± 4% -37.8% 6515713 ± 5% perf-stat.i.iTLB-loads
1.851e+10 -38.1% 1.146e+10 perf-stat.i.instructions
583.76 +20.3% 702.31 perf-stat.i.instructions-per-iTLB-miss
0.53 -29.0% 0.38 perf-stat.i.ipc
0.37 -11.4% 0.33 perf-stat.i.metric.GHz
0.59 ± 7% +31.5% 0.78 perf-stat.i.metric.K/sec
131.90 -40.2% 78.82 perf-stat.i.metric.M/sec
87.72 +8.1 95.81 perf-stat.i.node-load-miss-rate%
1334578 ± 3% -79.2% 277059 ± 8% perf-stat.i.node-loads
7293953 ± 4% -22.5% 5655933 ± 4% perf-stat.i.node-store-misses
18.20 ? 3% -26.7% 13.34 ? 3% perf-stat.overall.MPKI
1.14 +0.1 1.21 perf-stat.overall.branch-miss-rate%
1.90 +43.1% 2.73 perf-stat.overall.cpi
783.04 ? 6% +84.2% 1441 ? 6% perf-stat.overall.cycles-between-cache-misses
0.00 ? 20% +0.0 0.01 ? 9% perf-stat.overall.dTLB-load-miss-rate%
0.00 ? 10% +0.0 0.00 ? 12% perf-stat.overall.dTLB-store-miss-rate%
76.74 -3.7 73.05 perf-stat.overall.iTLB-load-miss-rate%
535.61 +21.3% 649.61 perf-stat.overall.instructions-per-iTLB-miss
0.53 -30.1% 0.37 perf-stat.overall.ipc
88.60 +8.3 96.86 perf-stat.overall.node-load-miss-rate%
3.847e+09 -36.6% 2.439e+09 perf-stat.ps.branch-instructions
43872091 -32.9% 29436754 ? 2% perf-stat.ps.branch-misses
44454773 ? 6% -51.9% 21395194 ? 6% perf-stat.ps.cache-misses
3.314e+08 ? 3% -54.6% 1.504e+08 ? 3% perf-stat.ps.cache-references
2035394 -49.0% 1037584 perf-stat.ps.context-switches
3.467e+10 -11.4% 3.073e+10 perf-stat.ps.cpu-cycles
168.47 ? 4% -16.7% 140.28 ? 2% perf-stat.ps.cpu-migrations
89350 ? 20% +378.8% 427772 ? 10% perf-stat.ps.dTLB-load-misses
5.075e+09 -39.3% 3.082e+09 perf-stat.ps.dTLB-loads
24898 ? 10% +104.2% 50844 ? 12% perf-stat.ps.dTLB-store-misses
3.173e+09 -44.8% 1.752e+09 perf-stat.ps.dTLB-stores
34003945 -48.9% 17359689 perf-stat.ps.iTLB-load-misses
10308325 ? 4% -37.8% 6410071 ? 5% perf-stat.ps.iTLB-loads
1.821e+10 -38.1% 1.128e+10 perf-stat.ps.instructions
1312891 ? 3% -79.2% 272597 ? 8% perf-stat.ps.node-loads
7175119 ? 4% -22.5% 5563364 ? 4% perf-stat.ps.node-store-misses
1.151e+12 -38.1% 7.13e+11 perf-stat.total.instructions
79831 ± 5% +55.8% 124388 ± 3% interrupts.CAL:Function_call_interrupts
911.17 ± 40% -74.2% 235.33 ± 29% interrupts.CPU0.RES:Rescheduling_interrupts
1922 ± 31% -77.2% 437.33 ± 51% interrupts.CPU1.RES:Rescheduling_interrupts
747.83 ± 18% +66.5% 1244 ± 9% interrupts.CPU10.CAL:Function_call_interrupts
736.67 ± 44% -70.5% 217.17 ± 44% interrupts.CPU10.RES:Rescheduling_interrupts
757.17 ± 20% +122.0% 1680 ± 58% interrupts.CPU11.CAL:Function_call_interrupts
800.00 ± 38% -68.5% 252.17 ± 41% interrupts.CPU11.RES:Rescheduling_interrupts
718.50 ± 19% +95.8% 1407 ± 26% interrupts.CPU12.CAL:Function_call_interrupts
651.00 ± 35% -72.1% 181.83 ± 78% interrupts.CPU12.RES:Rescheduling_interrupts
695.33 ± 42% -74.4% 178.17 ± 55% interrupts.CPU13.RES:Rescheduling_interrupts
676.67 ± 25% +219.5% 2162 ± 65% interrupts.CPU14.CAL:Function_call_interrupts
810.17 ± 45% -79.9% 162.50 ± 91% interrupts.CPU14.RES:Rescheduling_interrupts
3349 ± 62% -89.8% 340.50 ± 52% interrupts.CPU15.NMI:Non-maskable_interrupts
3349 ± 62% -89.8% 340.50 ± 52% interrupts.CPU15.PMI:Performance_monitoring_interrupts
477.33 ± 17% -44.1% 266.67 ± 64% interrupts.CPU15.RES:Rescheduling_interrupts
655.83 ± 26% -70.1% 196.00 ± 63% interrupts.CPU16.RES:Rescheduling_interrupts
611.33 ± 35% -69.8% 184.83 ± 71% interrupts.CPU17.RES:Rescheduling_interrupts
724.83 ± 40% -71.2% 209.00 ± 45% interrupts.CPU18.RES:Rescheduling_interrupts
608.67 ± 34% -68.2% 193.50 ±110% interrupts.CPU19.RES:Rescheduling_interrupts
1701 ± 32% -74.2% 438.83 ± 42% interrupts.CPU2.RES:Rescheduling_interrupts
599.33 ± 19% -64.3% 213.83 ± 76% interrupts.CPU21.RES:Rescheduling_interrupts
926.50 ± 41% -75.2% 230.17 ± 79% interrupts.CPU22.RES:Rescheduling_interrupts
630.00 ± 19% -71.4% 180.17 ± 55% interrupts.CPU23.RES:Rescheduling_interrupts
1787 ± 28% -60.1% 712.50 ± 31% interrupts.CPU24.RES:Rescheduling_interrupts
1343 ± 21% -59.3% 547.33 ± 16% interrupts.CPU25.RES:Rescheduling_interrupts
775.17 ± 17% +55.6% 1206 ± 4% interrupts.CPU26.CAL:Function_call_interrupts
1566 ± 34% -63.9% 565.83 ± 31% interrupts.CPU26.RES:Rescheduling_interrupts
679.67 ± 23% +84.6% 1254 ± 10% interrupts.CPU27.CAL:Function_call_interrupts
1631 ± 23% -60.4% 646.33 ± 35% interrupts.CPU27.RES:Rescheduling_interrupts
731.17 ± 24% +74.2% 1274 ± 8% interrupts.CPU28.CAL:Function_call_interrupts
1339 ± 31% -59.0% 549.50 ± 49% interrupts.CPU29.RES:Rescheduling_interrupts
883.50 ± 15% +81.9% 1607 ± 41% interrupts.CPU3.CAL:Function_call_interrupts
1237 ± 37% -78.0% 272.67 ± 46% interrupts.CPU3.RES:Rescheduling_interrupts
736.33 ± 20% +81.3% 1334 ± 10% interrupts.CPU30.CAL:Function_call_interrupts
1310 ± 26% -62.1% 497.00 ± 34% interrupts.CPU30.RES:Rescheduling_interrupts
840.83 ± 25% +52.4% 1281 ± 16% interrupts.CPU32.CAL:Function_call_interrupts
893.00 ± 35% -67.6% 289.67 ± 39% interrupts.CPU32.RES:Rescheduling_interrupts
2714 ± 73% -76.0% 650.83 ±107% interrupts.CPU33.NMI:Non-maskable_interrupts
2714 ± 73% -76.0% 650.83 ±107% interrupts.CPU33.PMI:Performance_monitoring_interrupts
1023 ± 20% -52.0% 490.67 ± 34% interrupts.CPU33.RES:Rescheduling_interrupts
754.33 ± 15% +62.8% 1228 ± 13% interrupts.CPU34.CAL:Function_call_interrupts
1069 ± 39% -64.5% 379.33 ± 43% interrupts.CPU34.RES:Rescheduling_interrupts
762.50 ± 22% +53.0% 1166 ± 4% interrupts.CPU35.CAL:Function_call_interrupts
1232 ± 11% -57.9% 518.33 ± 44% interrupts.CPU35.RES:Rescheduling_interrupts
812.50 ± 20% +56.7% 1273 ± 15% interrupts.CPU36.CAL:Function_call_interrupts
721.33 ± 28% +71.3% 1235 ± 6% interrupts.CPU38.CAL:Function_call_interrupts
784.17 ± 21% +83.4% 1438 ± 19% interrupts.CPU4.CAL:Function_call_interrupts
1249 ± 42% -83.2% 210.33 ± 33% interrupts.CPU4.RES:Rescheduling_interrupts
727.17 ± 11% +121.1% 1608 ± 61% interrupts.CPU40.CAL:Function_call_interrupts
661.67 ± 14% +75.0% 1157 ± 5% interrupts.CPU41.CAL:Function_call_interrupts
936.50 ± 37% -67.1% 308.17 ± 45% interrupts.CPU41.RES:Rescheduling_interrupts
673.33 ± 17% +88.8% 1271 ± 6% interrupts.CPU43.CAL:Function_call_interrupts
1008 ± 23% -48.2% 522.83 ± 35% interrupts.CPU43.RES:Rescheduling_interrupts
647.00 ± 24% +80.2% 1165 ± 3% interrupts.CPU44.CAL:Function_call_interrupts
690.67 ± 20% +68.5% 1163 ± 3% interrupts.CPU47.CAL:Function_call_interrupts
1016 ± 32% -58.8% 418.83 ± 38% interrupts.CPU47.RES:Rescheduling_interrupts
591.83 ± 49% -74.1% 153.00 ± 52% interrupts.CPU48.RES:Rescheduling_interrupts
781.17 ± 23% +54.2% 1204 ± 9% interrupts.CPU49.CAL:Function_call_interrupts
826.17 ± 49% +250.2% 2892 ± 69% interrupts.CPU49.NMI:Non-maskable_interrupts
826.17 ± 49% +250.2% 2892 ± 69% interrupts.CPU49.PMI:Performance_monitoring_interrupts
946.83 ± 40% -68.9% 294.17 ± 52% interrupts.CPU5.RES:Rescheduling_interrupts
760.83 ± 20% +55.8% 1185 ± 8% interrupts.CPU50.CAL:Function_call_interrupts
729.33 ± 34% -69.5% 222.33 ± 68% interrupts.CPU50.RES:Rescheduling_interrupts
756.83 ± 13% +59.1% 1204 ± 6% interrupts.CPU51.CAL:Function_call_interrupts
669.83 ± 13% +78.1% 1193 ± 11% interrupts.CPU52.CAL:Function_call_interrupts
665.50 ± 33% -59.5% 269.67 ± 49% interrupts.CPU52.RES:Rescheduling_interrupts
704.50 ± 12% +73.0% 1218 ± 5% interrupts.CPU53.CAL:Function_call_interrupts
691.00 ± 12% +67.5% 1157 ± 3% interrupts.CPU54.CAL:Function_call_interrupts
663.00 ± 10% +78.8% 1185 ± 7% interrupts.CPU55.CAL:Function_call_interrupts
647.50 ± 15% +80.8% 1170 ± 5% interrupts.CPU56.CAL:Function_call_interrupts
644.83 ± 15% +82.3% 1175 ± 6% interrupts.CPU57.CAL:Function_call_interrupts
689.17 ± 12% +70.0% 1171 ± 4% interrupts.CPU58.CAL:Function_call_interrupts
667.67 ± 39% -77.1% 152.67 ± 42% interrupts.CPU58.RES:Rescheduling_interrupts
638.50 ± 17% +82.7% 1166 interrupts.CPU59.CAL:Function_call_interrupts
525.17 ± 27% -70.4% 155.67 ± 55% interrupts.CPU59.RES:Rescheduling_interrupts
936.83 ± 36% -65.7% 321.33 ± 52% interrupts.CPU6.RES:Rescheduling_interrupts
738.50 ± 18% +121.2% 1633 ± 60% interrupts.CPU60.CAL:Function_call_interrupts
606.33 ± 48% -61.6% 233.00 ± 35% interrupts.CPU60.RES:Rescheduling_interrupts
694.17 ± 18% +74.7% 1212 ± 8% interrupts.CPU61.CAL:Function_call_interrupts
654.33 ± 22% +85.2% 1211 ± 13% interrupts.CPU62.CAL:Function_call_interrupts
502.00 ± 60% -70.9% 146.17 ± 66% interrupts.CPU62.RES:Rescheduling_interrupts
712.33 ± 10% +84.3% 1313 ± 23% interrupts.CPU63.CAL:Function_call_interrupts
3704 ± 60% -89.1% 402.83 ± 51% interrupts.CPU63.NMI:Non-maskable_interrupts
3704 ± 60% -89.1% 402.83 ± 51% interrupts.CPU63.PMI:Performance_monitoring_interrupts
617.33 ± 29% -60.2% 245.50 ± 57% interrupts.CPU63.RES:Rescheduling_interrupts
650.33 ± 14% +80.7% 1174 ± 6% interrupts.CPU64.CAL:Function_call_interrupts
669.00 ± 21% +75.9% 1176 ± 8% interrupts.CPU65.CAL:Function_call_interrupts
718.67 ± 31% +77.6% 1276 ± 14% interrupts.CPU66.CAL:Function_call_interrupts
715.33 ± 15% +58.9% 1136 ± 4% interrupts.CPU67.CAL:Function_call_interrupts
602.50 ± 39% -71.3% 172.67 ± 62% interrupts.CPU67.RES:Rescheduling_interrupts
578.17 ± 6% +175.3% 1591 ± 60% interrupts.CPU68.CAL:Function_call_interrupts
660.33 ± 13% +76.9% 1168 ± 5% interrupts.CPU69.CAL:Function_call_interrupts
728.67 ± 25% +58.2% 1152 ± 4% interrupts.CPU7.CAL:Function_call_interrupts
815.00 ± 47% -77.9% 180.50 ± 60% interrupts.CPU7.RES:Rescheduling_interrupts
671.50 ± 17% +78.2% 1196 ± 12% interrupts.CPU70.CAL:Function_call_interrupts
594.33 ± 60% -77.8% 132.17 ± 51% interrupts.CPU70.RES:Rescheduling_interrupts
730.00 ± 30% +62.5% 1186 ± 4% interrupts.CPU71.CAL:Function_call_interrupts
660.67 ± 32% -82.1% 118.50 ± 40% interrupts.CPU71.RES:Rescheduling_interrupts
859.50 ± 22% +43.0% 1229 ± 6% interrupts.CPU72.CAL:Function_call_interrupts
1160 ± 41% -68.4% 366.33 ± 22% interrupts.CPU72.RES:Rescheduling_interrupts
811.83 ± 18% +59.2% 1292 ± 15% interrupts.CPU73.CAL:Function_call_interrupts
851.33 ± 20% +50.1% 1277 ± 10% interrupts.CPU74.CAL:Function_call_interrupts
1067 ± 22% -53.7% 493.83 ± 52% interrupts.CPU74.RES:Rescheduling_interrupts
858.83 ± 10% +44.4% 1240 ± 9% interrupts.CPU75.CAL:Function_call_interrupts
747.50 ± 13% +60.0% 1196 ± 5% interrupts.CPU76.CAL:Function_call_interrupts
757.83 ± 15% +57.1% 1190 ± 7% interrupts.CPU77.CAL:Function_call_interrupts
965.83 ± 22% -61.7% 369.50 ± 32% interrupts.CPU79.RES:Rescheduling_interrupts
654.83 ± 16% +84.7% 1209 ± 13% interrupts.CPU8.CAL:Function_call_interrupts
667.33 ± 53% -70.2% 199.17 ± 79% interrupts.CPU8.RES:Rescheduling_interrupts
773.33 ± 17% +65.6% 1281 ± 13% interrupts.CPU80.CAL:Function_call_interrupts
2552 ± 31% -60.6% 1004 ± 73% interrupts.CPU80.NMI:Non-maskable_interrupts
2552 ± 31% -60.6% 1004 ± 73% interrupts.CPU80.PMI:Performance_monitoring_interrupts
962.17 ± 21% -60.4% 381.00 ± 14% interrupts.CPU80.RES:Rescheduling_interrupts
769.33 ± 20% +53.9% 1184 ± 10% interrupts.CPU81.CAL:Function_call_interrupts
2283 ± 56% -71.1% 658.83 ± 73% interrupts.CPU81.NMI:Non-maskable_interrupts
2283 ± 56% -71.1% 658.83 ± 73% interrupts.CPU81.PMI:Performance_monitoring_interrupts
717.83 ± 18% -40.2% 429.50 ± 31% interrupts.CPU81.RES:Rescheduling_interrupts
635.33 ± 6% +88.8% 1199 ± 8% interrupts.CPU82.CAL:Function_call_interrupts
673.00 ± 16% +79.9% 1210 ± 5% interrupts.CPU83.CAL:Function_call_interrupts
778.67 ± 24% +69.0% 1316 ± 12% interrupts.CPU85.CAL:Function_call_interrupts
703.83 ± 19% +81.6% 1277 ± 11% interrupts.CPU86.CAL:Function_call_interrupts
674.00 ± 17% +68.9% 1138 ± 3% interrupts.CPU87.CAL:Function_call_interrupts
673.17 ± 10% +89.9% 1278 ± 17% interrupts.CPU88.CAL:Function_call_interrupts
700.17 ± 17% +66.1% 1162 ± 10% interrupts.CPU89.CAL:Function_call_interrupts
909.17 ± 37% -79.0% 190.67 ± 44% interrupts.CPU9.RES:Rescheduling_interrupts
671.67 ± 16% +78.8% 1200 ± 10% interrupts.CPU90.CAL:Function_call_interrupts
869.17 ± 28% -53.2% 407.17 ± 34% interrupts.CPU90.RES:Rescheduling_interrupts
714.33 ± 12% +71.6% 1226 ± 10% interrupts.CPU91.CAL:Function_call_interrupts
694.50 ± 11% +77.0% 1229 ± 7% interrupts.CPU92.CAL:Function_call_interrupts
694.67 ± 22% +70.0% 1181 ± 5% interrupts.CPU93.CAL:Function_call_interrupts
780.67 ± 26% -59.3% 318.00 ± 31% interrupts.CPU93.RES:Rescheduling_interrupts
629.50 ± 11% +81.7% 1143 ± 3% interrupts.CPU94.CAL:Function_call_interrupts
589.00 +99.0% 1172 ± 6% interrupts.CPU95.CAL:Function_call_interrupts
671.50 ± 30% -61.0% 261.67 ± 61% interrupts.CPU95.RES:Rescheduling_interrupts
83175 ± 6% -60.3% 33000 ± 8% interrupts.RES:Rescheduling_interrupts
10.72 ± 9% -7.4 3.31 ± 10% perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
10.70 ± 9% -7.4 3.30 ± 10% perf-profile.calltrace.cycles-pp.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
10.08 ± 9% -7.2 2.92 ± 10% perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
10.04 ± 9% -7.2 2.88 ± 10% perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.do_sys_open.do_syscall_64
6.46 ± 8% -6.5 0.00 perf-profile.calltrace.cycles-pp.__kmalloc_node.seq_read_iter.seq_read.vfs_read.ksys_read
5.52 ± 7% -5.5 0.00 perf-profile.calltrace.cycles-pp.kfree.single_release.__fput.task_work_run.exit_to_user_mode_prepare
5.45 ± 8% -5.5 0.00 perf-profile.calltrace.cycles-pp.obj_cgroup_charge.__kmalloc_node.seq_read_iter.seq_read.vfs_read
32.13 ± 7% -4.7 27.43 ± 8% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.51 ± 7% -2.9 0.65 ± 7% perf-profile.calltrace.cycles-pp.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2.do_sys_open
3.49 ± 7% -2.8 0.64 ± 8% perf-profile.calltrace.cycles-pp.__alloc_file.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2
6.34 ± 7% -2.7 3.69 ± 11% perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.98 ± 7% -2.5 3.45 ± 10% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
5.15 ± 9% -2.4 2.73 ± 10% perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
5.49 ± 7% -2.2 3.30 ± 11% perf-profile.calltrace.cycles-pp.eventfd_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.98 ± 7% -2.0 2.98 ± 10% perf-profile.calltrace.cycles-pp.__wake_up_common.eventfd_write.vfs_write.ksys_write.do_syscall_64
4.58 ± 7% -1.8 2.74 ± 9% perf-profile.calltrace.cycles-pp.try_to_wake_up.__wake_up_common.eventfd_write.vfs_write.ksys_write
2.45 ± 7% -1.8 0.62 ± 13% perf-profile.calltrace.cycles-pp.do_dentry_open.path_openat.do_filp_open.do_sys_openat2.do_sys_open
4.47 ± 7% -1.8 2.68 ± 10% perf-profile.calltrace.cycles-pp.new_sync_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.28 ± 7% -1.7 2.54 ± 10% perf-profile.calltrace.cycles-pp.eventfd_read.new_sync_read.vfs_read.ksys_read.do_syscall_64
3.01 ± 7% -1.4 1.59 ± 8% perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
3.10 ± 7% -1.3 1.76 ± 8% perf-profile.calltrace.cycles-pp.schedule.eventfd_read.new_sync_read.vfs_read.ksys_read
2.06 ± 13% -1.1 0.97 ± 10% perf-profile.calltrace.cycles-pp.link_path_walk.path_openat.do_filp_open.do_sys_openat2.do_sys_open
1.81 ± 13% -1.0 0.82 ± 10% perf-profile.calltrace.cycles-pp.walk_component.link_path_walk.path_openat.do_filp_open.do_sys_openat2
2.04 ± 6% -1.0 1.08 ± 8% perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.__wake_up_common.eventfd_write.vfs_write
1.96 ± 6% -0.9 1.04 ± 9% perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.__wake_up_common.eventfd_write
1.49 ± 6% -0.7 0.80 ± 9% perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.__wake_up_common
1.30 ± 15% -0.7 0.64 ± 9% perf-profile.calltrace.cycles-pp.lookup_fast.walk_component.link_path_walk.path_openat.do_filp_open
1.51 ± 8% -0.5 0.99 ± 10% perf-profile.calltrace.cycles-pp.seq_show.seq_read_iter.seq_read.vfs_read.ksys_read
0.00 +0.7 0.73 ± 8% perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule_idle.do_idle.cpu_startup_entry
0.00 +0.8 0.78 ± 6% perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.eventfd_read
0.00 +0.9 0.88 ± 8% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.__vmalloc_node_range.__vmalloc_node.seq_read_iter.seq_read
0.00 +0.9 0.88 ± 6% perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.eventfd_read.new_sync_read
0.00 +1.1 1.09 ± 7% perf-profile.calltrace.cycles-pp.insert_vmap_area.alloc_vmap_area.__get_vm_area_node.__vmalloc_node_range.__vmalloc_node
0.00 +1.5 1.53 ± 8% perf-profile.calltrace.cycles-pp.__schedule.schedule_idle.do_idle.cpu_startup_entry.start_secondary
0.00 +1.7 1.68 ± 7% perf-profile.calltrace.cycles-pp.__schedule.schedule.eventfd_read.new_sync_read.vfs_read
0.00 +5.0 4.98 ± 8% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.find_vmap_area.__vunmap.single_release
0.00 +5.1 5.13 ± 8% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.free_vmap_area_noflush.remove_vm_area.__vunmap
0.00 +5.3 5.32 ± 8% perf-profile.calltrace.cycles-pp._raw_spin_lock.find_vmap_area.__vunmap.single_release.__fput
0.00 +5.4 5.35 ± 7% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.remove_vm_area.__vunmap.single_release
0.00 +5.4 5.42 ± 8% perf-profile.calltrace.cycles-pp._raw_spin_lock.free_vmap_area_noflush.remove_vm_area.__vunmap.single_release
0.00 +5.4 5.45 ± 8% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__get_vm_area_node.__vmalloc_node_range.__vmalloc_node
0.00 +5.6 5.57 ± 8% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.alloc_vmap_area.__get_vm_area_node.__vmalloc_node_range
14.60 ± 7% +5.6 20.17 ± 8% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +5.6 5.58 ± 7% perf-profile.calltrace.cycles-pp._raw_spin_lock.remove_vm_area.__vunmap.single_release.__fput
14.40 ± 7% +5.7 20.07 ± 8% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.00 +5.7 5.75 ± 8% perf-profile.calltrace.cycles-pp._raw_spin_lock.__get_vm_area_node.__vmalloc_node_range.__vmalloc_node.seq_read_iter
0.00 +6.0 6.05 ± 7% perf-profile.calltrace.cycles-pp.find_vmap_area.__vunmap.single_release.__fput.task_work_run
0.00 +6.1 6.12 ± 8% perf-profile.calltrace.cycles-pp._raw_spin_lock.alloc_vmap_area.__get_vm_area_node.__vmalloc_node_range.__vmalloc_node
0.00 +6.9 6.93 ± 8% perf-profile.calltrace.cycles-pp.free_vmap_area_noflush.remove_vm_area.__vunmap.single_release.__fput
0.00 +8.3 8.28 ± 8% perf-profile.calltrace.cycles-pp.alloc_vmap_area.__get_vm_area_node.__vmalloc_node_range.__vmalloc_node.seq_read_iter
8.44 ± 7% +8.3 16.79 ± 8% perf-profile.calltrace.cycles-pp.seq_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.38 ± 7% +8.4 16.74 ± 8% perf-profile.calltrace.cycles-pp.seq_read_iter.seq_read.vfs_read.ksys_read.do_syscall_64
40.76 ± 7% +8.5 49.21 ± 8% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
8.41 ± 7% +13.3 21.66 ± 7% perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
8.34 ± 7% +13.3 21.61 ± 7% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
7.90 ± 7% +13.5 21.36 ± 7% perf-profile.calltrace.cycles-pp.task_work_run.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
0.00 +13.5 13.54 ± 8% perf-profile.calltrace.cycles-pp.remove_vm_area.__vunmap.single_release.__fput.task_work_run
7.70 ± 7% +13.5 21.25 ± 7% perf-profile.calltrace.cycles-pp.__fput.task_work_run.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
6.94 ± 7% +13.9 20.82 ± 7% perf-profile.calltrace.cycles-pp.single_release.__fput.task_work_run.exit_to_user_mode_prepare.syscall_exit_to_user_mode
0.00 +14.2 14.19 ± 8% perf-profile.calltrace.cycles-pp.__get_vm_area_node.__vmalloc_node_range.__vmalloc_node.seq_read_iter.seq_read
0.00 +15.5 15.55 ± 8% perf-profile.calltrace.cycles-pp.__vmalloc_node_range.__vmalloc_node.seq_read_iter.seq_read.vfs_read
0.00 +15.6 15.55 ± 8% perf-profile.calltrace.cycles-pp.__vmalloc_node.seq_read_iter.seq_read.vfs_read.ksys_read
0.00 +20.5 20.47 ± 7% perf-profile.calltrace.cycles-pp.__vunmap.single_release.__fput.task_work_run.exit_to_user_mode_prepare
8.82 ± 7% -8.6 0.20 ± 9% perf-profile.children.cycles-pp.refill_obj_stock
10.72 ± 9% -7.4 3.31 ± 10% perf-profile.children.cycles-pp.do_sys_open
10.71 ± 9% -7.4 3.30 ± 10% perf-profile.children.cycles-pp.do_sys_openat2
10.09 ± 9% -7.2 2.92 ± 10% perf-profile.children.cycles-pp.do_filp_open
10.05 ± 9% -7.2 2.89 ± 10% perf-profile.children.cycles-pp.path_openat
7.18 ± 8% -7.0 0.13 ± 19% perf-profile.children.cycles-pp.obj_cgroup_charge
6.47 ± 8% -6.4 0.08 ± 20% perf-profile.children.cycles-pp.__kmalloc_node
6.24 ± 8% -6.1 0.10 ± 10% perf-profile.children.cycles-pp.drain_obj_stock
5.99 ± 7% -6.0 0.00 perf-profile.children.cycles-pp.__sched_text_start
5.53 ± 7% -5.4 0.16 ± 14% perf-profile.children.cycles-pp.kfree
32.20 ± 7% -4.7 27.51 ± 8% perf-profile.children.cycles-pp.do_syscall_64
4.35 ± 9% -4.2 0.18 ± 10% perf-profile.children.cycles-pp.page_counter_uncharge
4.33 ± 9% -4.1 0.18 ± 11% perf-profile.children.cycles-pp.__memcg_kmem_uncharge
4.31 ± 9% -4.1 0.17 ± 11% perf-profile.children.cycles-pp.page_counter_cancel
3.93 ± 8% -3.5 0.48 ± 9% perf-profile.children.cycles-pp.__memcg_kmem_charge
3.83 ± 8% -3.4 0.44 ± 9% perf-profile.children.cycles-pp.page_counter_try_charge
3.52 ± 7% -3.0 0.57 ± 10% perf-profile.children.cycles-pp.kmem_cache_alloc
3.51 ± 8% -2.9 0.65 ± 7% perf-profile.children.cycles-pp.alloc_empty_file
3.49 ± 8% -2.8 0.64 ± 8% perf-profile.children.cycles-pp.__alloc_file
6.35 ± 7% -2.7 3.70 ± 11% perf-profile.children.cycles-pp.ksys_write
6.00 ± 7% -2.5 3.47 ± 10% perf-profile.children.cycles-pp.vfs_write
5.17 ± 9% -2.4 2.78 ± 9% perf-profile.children.cycles-pp.poll_idle
5.49 ± 7% -2.2 3.31 ± 11% perf-profile.children.cycles-pp.eventfd_write
4.99 ± 7% -2.0 2.99 ± 10% perf-profile.children.cycles-pp.__wake_up_common
1.90 ± 9% -1.9 0.05 ± 45% perf-profile.children.cycles-pp.propagate_protected_usage
4.60 ± 7% -1.8 2.75 ± 10% perf-profile.children.cycles-pp.try_to_wake_up
2.45 ± 7% -1.8 0.62 ± 13% perf-profile.children.cycles-pp.do_dentry_open
4.48 ± 7% -1.8 2.69 ± 10% perf-profile.children.cycles-pp.new_sync_read
2.33 ± 5% -1.8 0.55 ± 11% perf-profile.children.cycles-pp.kmem_cache_free
4.32 ± 7% -1.7 2.57 ± 10% perf-profile.children.cycles-pp.eventfd_read
3.04 ± 7% -1.4 1.61 ± 8% perf-profile.children.cycles-pp.schedule_idle
3.12 ± 7% -1.3 1.78 ± 8% perf-profile.children.cycles-pp.schedule
1.35 ± 7% -1.2 0.11 ± 15% perf-profile.children.cycles-pp.get_obj_cgroup_from_current
1.53 ± 5% -1.2 0.36 ± 13% perf-profile.children.cycles-pp.single_open
2.06 ± 13% -1.1 0.97 ± 10% perf-profile.children.cycles-pp.link_path_walk
1.81 ± 13% -1.0 0.82 ± 11% perf-profile.children.cycles-pp.walk_component
1.28 ± 7% -1.0 0.31 ± 11% perf-profile.children.cycles-pp.security_file_permission
1.13 ± 8% -0.9 0.21 ± 15% perf-profile.children.cycles-pp.common_file_perm
2.05 ± 7% -0.9 1.17 ± 8% perf-profile.children.cycles-pp.ttwu_do_activate
1.98 ± 7% -0.8 1.13 ± 8% perf-profile.children.cycles-pp.enqueue_task_fair
1.62 ± 13% -0.8 0.85 ± 9% perf-profile.children.cycles-pp.lookup_fast
1.60 ± 6% -0.8 0.83 ± 7% perf-profile.children.cycles-pp.pick_next_task_fair
0.92 ± 7% -0.8 0.17 ± 15% perf-profile.children.cycles-pp.kmem_cache_alloc_trace
1.62 ± 6% -0.7 0.88 ± 5% perf-profile.children.cycles-pp.dequeue_task_fair
0.81 ± 10% -0.7 0.12 ± 13% perf-profile.children.cycles-pp.ima_file_check
0.80 ± 10% -0.7 0.11 ± 14% perf-profile.children.cycles-pp.security_task_getsecid
0.78 ± 10% -0.7 0.11 ± 14% perf-profile.children.cycles-pp.apparmor_task_getsecid
1.45 ± 7% -0.7 0.79 ± 6% perf-profile.children.cycles-pp.dequeue_entity
1.52 ± 7% -0.6 0.87 ± 10% perf-profile.children.cycles-pp.enqueue_entity
0.94 ± 13% -0.6 0.31 ± 17% perf-profile.children.cycles-pp.dput
0.99 ± 6% -0.6 0.37 ± 10% perf-profile.children.cycles-pp.rcu_do_batch
1.04 ± 6% -0.6 0.42 ± 10% perf-profile.children.cycles-pp.rcu_core
1.34 ± 6% -0.6 0.76 ± 10% perf-profile.children.cycles-pp.__softirqentry_text_start
0.81 ± 7% -0.6 0.24 ± 10% perf-profile.children.cycles-pp.smpboot_thread_fn
0.80 ± 7% -0.6 0.23 ± 10% perf-profile.children.cycles-pp.run_ksoftirqd
0.74 ± 10% -0.6 0.18 ± 14% perf-profile.children.cycles-pp.security_file_open
0.84 ± 6% -0.6 0.28 ± 11% perf-profile.children.cycles-pp.ret_from_fork
0.83 ± 7% -0.6 0.28 ± 11% perf-profile.children.cycles-pp.kthread
0.72 ± 10% -0.5 0.18 ± 13% perf-profile.children.cycles-pp.apparmor_file_open
1.52 ± 8% -0.5 0.99 ± 10% perf-profile.children.cycles-pp.seq_show
1.06 ± 7% -0.5 0.54 ± 8% perf-profile.children.cycles-pp.set_next_entity
1.13 ± 6% -0.5 0.68 ± 7% perf-profile.children.cycles-pp.update_load_avg
0.78 ± 20% -0.4 0.34 ± 13% perf-profile.children.cycles-pp.unlazy_child
0.60 ± 16% -0.4 0.17 ± 19% perf-profile.children.cycles-pp.terminate_walk
0.51 ± 10% -0.4 0.08 ± 20% perf-profile.children.cycles-pp.aa_get_task_label
1.07 ± 9% -0.4 0.64 ± 8% perf-profile.children.cycles-pp.seq_printf
0.60 ± 4% -0.4 0.18 ± 12% perf-profile.children.cycles-pp.seq_open
1.03 ± 9% -0.4 0.62 ± 8% perf-profile.children.cycles-pp.seq_vprintf
0.53 ± 12% -0.4 0.12 ± 26% perf-profile.children.cycles-pp.lockref_put_return
1.01 ± 9% -0.4 0.61 ± 8% perf-profile.children.cycles-pp.vsnprintf
0.81 ± 7% -0.4 0.43 ± 7% perf-profile.children.cycles-pp.update_curr
0.67 ± 9% -0.4 0.30 ± 7% perf-profile.children.cycles-pp.security_file_alloc
0.90 ± 8% -0.4 0.53 ± 7% perf-profile.children.cycles-pp.update_rq_clock
0.57 ± 8% -0.3 0.24 ± 6% perf-profile.children.cycles-pp.apparmor_file_alloc_security
0.52 ± 25% -0.3 0.19 ± 20% perf-profile.children.cycles-pp.step_into
0.58 ± 25% -0.3 0.26 ± 16% perf-profile.children.cycles-pp.lockref_get_not_dead
0.90 ± 9% -0.3 0.61 ± 21% perf-profile.children.cycles-pp._raw_spin_lock_irq
0.37 ± 18% -0.3 0.09 ± 15% perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
0.33 ± 30% -0.3 0.08 ± 17% perf-profile.children.cycles-pp.__legitimize_path
0.29 ± 26% -0.3 0.04 ± 71% perf-profile.children.cycles-pp.__mod_memcg_state
0.49 ± 8% -0.2 0.25 ± 9% perf-profile.children.cycles-pp.security_file_free
0.48 ± 8% -0.2 0.25 ± 9% perf-profile.children.cycles-pp.apparmor_file_free_security
0.48 ± 8% -0.2 0.27 ± 9% perf-profile.children.cycles-pp.__switch_to
0.57 ± 7% -0.2 0.36 ± 14% perf-profile.children.cycles-pp._copy_to_iter
0.47 ± 8% -0.2 0.27 ± 8% perf-profile.children.cycles-pp.syscall_return_via_sysret
0.67 ± 9% -0.2 0.47 ± 20% perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
0.47 ± 8% -0.2 0.28 ± 15% perf-profile.children.cycles-pp.tick_nohz_idle_exit
0.43 ± 9% -0.2 0.25 ± 8% perf-profile.children.cycles-pp.__switch_to_asm
0.35 ± 8% -0.2 0.17 ± 12% perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
0.45 ± 6% -0.2 0.27 ± 16% perf-profile.children.cycles-pp.__fdget_pos
0.33 ± 5% -0.2 0.16 ± 8% perf-profile.children.cycles-pp.pick_next_entity
0.40 ± 7% -0.2 0.24 ± 18% perf-profile.children.cycles-pp.__fget_light
0.30 ± 11% -0.2 0.15 ± 13% perf-profile.children.cycles-pp.update_cfs_group
0.32 ± 9% -0.1 0.18 ± 11% perf-profile.children.cycles-pp.switch_fpu_return
0.35 ± 9% -0.1 0.21 ± 10% perf-profile.children.cycles-pp.ttwu_do_wakeup
0.30 ± 6% -0.1 0.16 ± 11% perf-profile.children.cycles-pp.switch_mm_irqs_off
0.34 ± 10% -0.1 0.20 ± 10% perf-profile.children.cycles-pp.check_preempt_curr
0.34 ± 9% -0.1 0.20 ± 7% perf-profile.children.cycles-pp.__update_load_avg_se
0.41 ± 5% -0.1 0.27 ± 11% perf-profile.children.cycles-pp.eventfd_show_fdinfo
0.39 ± 8% -0.1 0.25 ± 17% perf-profile.children.cycles-pp.__might_fault
0.47 ± 8% -0.1 0.33 ± 13% perf-profile.children.cycles-pp.___might_sleep
0.31 ± 7% -0.1 0.18 ± 10% perf-profile.children.cycles-pp.available_idle_cpu
0.32 ± 10% -0.1 0.18 ± 11% perf-profile.children.cycles-pp.pid_revalidate
0.41 ± 6% -0.1 0.27 ± 11% perf-profile.children.cycles-pp.sched_clock_cpu
0.27 ± 9% -0.1 0.14 ± 12% perf-profile.children.cycles-pp.__x64_sys_close
0.37 ± 8% -0.1 0.24 ± 15% perf-profile.children.cycles-pp.getname_flags
0.28 ± 9% -0.1 0.15 ± 9% perf-profile.children.cycles-pp.reweight_entity
0.19 ± 12% -0.1 0.06 ± 51% perf-profile.children.cycles-pp.cpuacct_charge
0.21 ± 5% -0.1 0.09 ± 10% perf-profile.children.cycles-pp.__check_object_size
0.37 ± 6% -0.1 0.25 ± 10% perf-profile.children.cycles-pp.sched_clock
0.46 ± 6% -0.1 0.34 ± 15% perf-profile.children.cycles-pp.select_task_rq_fair
0.22 ± 9% -0.1 0.10 ± 12% perf-profile.children.cycles-pp.perf_tp_event
0.32 ± 9% -0.1 0.20 ± 21% perf-profile.children.cycles-pp.update_ts_time_stats
0.29 ± 7% -0.1 0.18 ± 6% perf-profile.children.cycles-pp.format_decode
0.35 ± 6% -0.1 0.23 ± 12% perf-profile.children.cycles-pp.native_sched_clock
0.29 ± 11% -0.1 0.17 ± 9% perf-profile.children.cycles-pp.number
0.32 ± 9% -0.1 0.21 ± 20% perf-profile.children.cycles-pp.nr_iowait_cpu
0.20 ± 10% -0.1 0.10 ± 12% perf-profile.children.cycles-pp.filp_close
0.25 ± 11% -0.1 0.16 ± 9% perf-profile.children.cycles-pp.__entry_text_start
0.21 ± 13% -0.1 0.12 ± 16% perf-profile.children.cycles-pp.finish_task_switch
0.25 ± 11% -0.1 0.15 ± 12% perf-profile.children.cycles-pp.tid_fd_revalidate
0.15 ± 14% -0.1 0.06 ± 47% perf-profile.children.cycles-pp.rcu_read_unlock_strict
0.19 ? 9% -0.1 0.10 ? 10% perf-profile.children.cycles-pp.fput_many
0.23 ? 8% -0.1 0.14 ? 12% perf-profile.children.cycles-pp.tick_nohz_idle_enter
0.19 ? 9% -0.1 0.11 ? 18% perf-profile.children.cycles-pp.task_dump_owner
0.20 ? 7% -0.1 0.11 ? 13% perf-profile.children.cycles-pp.strncpy_from_user
0.10 ? 14% -0.1 0.03 ? 99% perf-profile.children.cycles-pp.alloc_fd
0.23 ? 12% -0.1 0.15 ? 12% perf-profile.children.cycles-pp.resched_curr
0.20 ? 10% -0.1 0.13 ? 14% perf-profile.children.cycles-pp.memcpy_erms
0.18 ? 8% -0.1 0.10 ? 14% perf-profile.children.cycles-pp.inode_permission
0.18 ? 4% -0.1 0.11 ? 17% perf-profile.children.cycles-pp.copy_fpregs_to_fpstate
0.10 ? 27% -0.1 0.03 ?100% perf-profile.children.cycles-pp.legitimize_mnt
0.13 ? 24% -0.1 0.06 ? 14% perf-profile.children.cycles-pp.__legitimize_mnt
0.15 ? 9% -0.1 0.08 ? 19% perf-profile.children.cycles-pp.pid_update_inode
0.20 ? 9% -0.1 0.13 ? 10% perf-profile.children.cycles-pp.rcu_idle_exit
0.30 ? 5% -0.1 0.23 ? 11% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.15 ? 15% -0.1 0.09 ? 8% perf-profile.children.cycles-pp.__wrgsbase_inactive
0.12 ? 11% -0.1 0.06 ? 14% perf-profile.children.cycles-pp.task_work_add
0.16 ? 8% -0.1 0.10 ? 12% perf-profile.children.cycles-pp.copyout
0.27 ? 5% -0.1 0.21 ? 16% perf-profile.children.cycles-pp.read_tsc
0.19 ? 7% -0.1 0.14 ? 10% perf-profile.children.cycles-pp.place_entity
0.10 ? 4% -0.1 0.05 ? 45% perf-profile.children.cycles-pp.rcu_eqs_enter
0.09 ? 5% -0.1 0.04 ? 71% perf-profile.children.cycles-pp.aa_file_perm
0.09 ? 7% -0.1 0.04 ? 45% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.13 ? 7% -0.0 0.08 ? 21% perf-profile.children.cycles-pp.___perf_sw_event
0.14 ? 3% -0.0 0.09 ? 12% perf-profile.children.cycles-pp._copy_from_user
0.11 ? 13% -0.0 0.07 ? 7% perf-profile.children.cycles-pp.call_rcu
0.10 ? 7% -0.0 0.05 ? 47% perf-profile.children.cycles-pp.update_min_vruntime
0.13 ? 7% -0.0 0.09 ? 18% perf-profile.children.cycles-pp.fsnotify
0.15 ? 9% -0.0 0.11 ? 11% perf-profile.children.cycles-pp.get_next_timer_interrupt
0.09 ? 7% -0.0 0.05 ? 46% perf-profile.children.cycles-pp.put_prev_task_fair
0.14 ? 15% -0.0 0.09 ? 14% perf-profile.children.cycles-pp.hrtimer_next_event_without
0.11 ? 10% -0.0 0.07 ? 14% perf-profile.children.cycles-pp.__d_lookup
0.11 ? 6% -0.0 0.07 ? 17% perf-profile.children.cycles-pp.copy_user_generic_unrolled
0.12 ? 7% -0.0 0.08 ? 13% perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
0.10 ? 12% -0.0 0.06 ? 13% perf-profile.children.cycles-pp.lockref_put_or_lock
0.07 ? 6% -0.0 0.04 ? 73% perf-profile.children.cycles-pp.proc_pid_permission
0.11 ? 9% -0.0 0.07 ? 12% perf-profile.children.cycles-pp.rcu_eqs_exit
0.09 ? 13% -0.0 0.05 ? 46% perf-profile.children.cycles-pp.__fsnotify_parent
0.09 ? 13% -0.0 0.05 ? 8% perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
0.10 ? 10% -0.0 0.06 ? 11% perf-profile.children.cycles-pp.__calc_delta
0.09 ? 8% -0.0 0.05 ? 9% perf-profile.children.cycles-pp.call_cpuidle
0.08 ? 5% -0.0 0.05 ? 8% perf-profile.children.cycles-pp.__rdgsbase_inactive
0.10 ? 13% -0.0 0.07 ? 17% perf-profile.children.cycles-pp.tid_fd_mode
0.09 ? 10% -0.0 0.07 ? 16% perf-profile.children.cycles-pp.hrtimer_get_next_event
0.03 ±100% +0.0 0.07 ± 16% perf-profile.children.cycles-pp.__list_del_entry_valid
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.kmem_cache_alloc_node_trace
0.00 +0.1 0.06 ± 6% perf-profile.children.cycles-pp.map_kernel_range_noflush
0.00 +0.1 0.07 ± 14% perf-profile.children.cycles-pp.get_page_from_freelist
0.09 ± 23% +0.1 0.17 ± 28% perf-profile.children.cycles-pp.ktime_get_update_offsets_now
0.00 +0.1 0.13 ± 9% perf-profile.children.cycles-pp.rb_insert_color
0.15 ± 26% +0.1 0.29 ± 55% perf-profile.children.cycles-pp.irq_enter_rcu
0.00 +0.2 0.16 ± 14% perf-profile.children.cycles-pp.__memcg_kmem_uncharge_page
0.00 +0.2 0.17 ± 12% perf-profile.children.cycles-pp.free_pcp_prepare
0.00 +0.2 0.21 ± 15% perf-profile.children.cycles-pp.free_unref_page
0.00 +0.3 0.31 ± 8% perf-profile.children.cycles-pp.__free_pages
0.00 +0.4 0.39 ± 9% perf-profile.children.cycles-pp.rb_erase
0.00 +0.4 0.40 ± 11% perf-profile.children.cycles-pp.unmap_kernel_range_noflush
0.00 +0.5 0.49 ± 8% perf-profile.children.cycles-pp.__memcg_kmem_charge_page
0.00 +0.9 0.88 ± 8% perf-profile.children.cycles-pp.__alloc_pages_nodemask
0.00 +1.1 1.09 ± 6% perf-profile.children.cycles-pp.insert_vmap_area
0.00 +3.3 3.29 ± 7% perf-profile.children.cycles-pp.__schedule
14.61 ± 7% +5.6 20.19 ± 8% perf-profile.children.cycles-pp.ksys_read
14.42 ± 7% +5.7 20.09 ± 8% perf-profile.children.cycles-pp.vfs_read
0.00 +6.0 6.05 ± 7% perf-profile.children.cycles-pp.find_vmap_area
0.00 +6.9 6.93 ± 8% perf-profile.children.cycles-pp.free_vmap_area_noflush
0.00 +8.3 8.29 ± 8% perf-profile.children.cycles-pp.alloc_vmap_area
8.44 ± 7% +8.4 16.79 ± 8% perf-profile.children.cycles-pp.seq_read
8.39 ± 7% +8.4 16.75 ± 8% perf-profile.children.cycles-pp.seq_read_iter
40.84 ± 7% +8.5 49.29 ± 8% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
8.43 ± 7% +13.2 21.67 ± 7% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
8.35 ± 7% +13.3 21.62 ± 7% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
7.91 ± 7% +13.5 21.37 ± 7% perf-profile.children.cycles-pp.task_work_run
7.72 ± 7% +13.5 21.26 ± 7% perf-profile.children.cycles-pp.__fput
0.00 +13.6 13.55 ± 8% perf-profile.children.cycles-pp.remove_vm_area
6.94 ± 7% +13.9 20.82 ± 7% perf-profile.children.cycles-pp.single_release
0.00 +14.2 14.20 ± 8% perf-profile.children.cycles-pp.__get_vm_area_node
0.00 +15.6 15.56 ± 8% perf-profile.children.cycles-pp.__vmalloc_node
0.00 +15.6 15.56 ± 8% perf-profile.children.cycles-pp.__vmalloc_node_range
0.00 +20.5 20.48 ± 7% perf-profile.children.cycles-pp.__vunmap
0.00 +26.5 26.50 ± 8% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.71 ± 7% +27.9 28.63 ± 8% perf-profile.children.cycles-pp._raw_spin_lock
2.98 ± 9% -2.8 0.14 ± 14% perf-profile.self.cycles-pp.page_counter_cancel
3.22 ± 8% -2.8 0.42 ± 9% perf-profile.self.cycles-pp.page_counter_try_charge
2.51 ± 7% -2.4 0.10 ± 12% perf-profile.self.cycles-pp.refill_obj_stock
5.07 ± 9% -2.3 2.72 ± 9% perf-profile.self.cycles-pp.poll_idle
1.88 ± 9% -1.9 0.03 ± 70% perf-profile.self.cycles-pp.propagate_protected_usage
1.30 ± 7% -1.2 0.09 ± 16% perf-profile.self.cycles-pp.get_obj_cgroup_from_current
1.04 ± 8% -0.9 0.16 ± 16% perf-profile.self.cycles-pp.common_file_perm
0.87 ± 6% -0.8 0.11 ± 15% perf-profile.self.cycles-pp.kfree
0.96 ± 7% -0.7 0.23 ± 10% perf-profile.self.cycles-pp.kmem_cache_alloc
0.86 ± 4% -0.6 0.30 ± 11% perf-profile.self.cycles-pp.kmem_cache_free
0.72 ± 10% -0.5 0.17 ± 12% perf-profile.self.cycles-pp.apparmor_file_open
0.50 ± 10% -0.4 0.08 ± 16% perf-profile.self.cycles-pp.aa_get_task_label
0.52 ± 13% -0.4 0.12 ± 26% perf-profile.self.cycles-pp.lockref_put_return
0.55 ± 8% -0.3 0.23 ± 8% perf-profile.self.cycles-pp.apparmor_file_alloc_security
0.72 ± 8% -0.3 0.40 ± 8% perf-profile.self.cycles-pp.update_rq_clock
0.57 ± 26% -0.3 0.26 ± 15% perf-profile.self.cycles-pp.lockref_get_not_dead
0.64 ± 7% -0.3 0.33 ± 8% perf-profile.self.cycles-pp.set_next_entity
0.86 ± 10% -0.3 0.59 ± 21% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.35 ± 9% -0.3 0.08 ± 19% perf-profile.self.cycles-pp.kmem_cache_alloc_trace
0.33 ± 15% -0.3 0.07 ± 24% perf-profile.self.cycles-pp.obj_cgroup_charge
0.53 ± 7% -0.3 0.28 ± 7% perf-profile.self.cycles-pp.update_load_avg
0.29 ± 26% -0.3 0.04 ± 71% perf-profile.self.cycles-pp.__mod_memcg_state
0.48 ± 8% -0.2 0.24 ± 9% perf-profile.self.cycles-pp.apparmor_file_free_security
0.53 ± 8% -0.2 0.31 ± 10% perf-profile.self.cycles-pp.do_idle
0.46 ± 8% -0.2 0.26 ± 4% perf-profile.self.cycles-pp.enqueue_task_fair
0.41 ± 5% -0.2 0.20 ± 16% perf-profile.self.cycles-pp.enqueue_entity
0.47 ± 8% -0.2 0.27 ± 7% perf-profile.self.cycles-pp.syscall_return_via_sysret
0.44 ± 8% -0.2 0.25 ± 9% perf-profile.self.cycles-pp.__switch_to
0.43 ± 9% -0.2 0.25 ± 8% perf-profile.self.cycles-pp.__switch_to_asm
0.38 ± 7% -0.2 0.20 ± 6% perf-profile.self.cycles-pp.update_curr
0.50 ± 8% -0.2 0.31 ± 19% perf-profile.self.cycles-pp.vfs_read
0.32 ± 5% -0.2 0.15 ± 8% perf-profile.self.cycles-pp.pick_next_entity
0.39 ± 7% -0.2 0.24 ± 16% perf-profile.self.cycles-pp.__fget_light
0.30 ± 12% -0.2 0.15 ± 14% perf-profile.self.cycles-pp.update_cfs_group
0.18 ± 57% -0.1 0.04 ± 71% perf-profile.self.cycles-pp.dput
0.32 ± 8% -0.1 0.17 ± 12% perf-profile.self.cycles-pp.switch_fpu_return
0.36 ± 9% -0.1 0.22 ± 17% perf-profile.self.cycles-pp.__wake_up_common
0.30 ± 6% -0.1 0.16 ± 11% perf-profile.self.cycles-pp.switch_mm_irqs_off
0.31 ± 7% -0.1 0.17 ± 9% perf-profile.self.cycles-pp.available_idle_cpu
0.28 ± 9% -0.1 0.15 ± 8% perf-profile.self.cycles-pp.reweight_entity
0.33 ± 10% -0.1 0.19 ± 7% perf-profile.self.cycles-pp.__update_load_avg_se
0.46 ± 8% -0.1 0.33 ± 14% perf-profile.self.cycles-pp.___might_sleep
0.19 ± 12% -0.1 0.06 ± 51% perf-profile.self.cycles-pp.cpuacct_charge
0.34 ± 6% -0.1 0.22 ± 11% perf-profile.self.cycles-pp.native_sched_clock
0.19 ± 10% -0.1 0.08 ± 15% perf-profile.self.cycles-pp.perf_tp_event
0.32 ± 9% -0.1 0.21 ± 21% perf-profile.self.cycles-pp.nr_iowait_cpu
0.27 ± 8% -0.1 0.17 ± 9% perf-profile.self.cycles-pp.try_to_wake_up
0.27 ± 6% -0.1 0.16 ± 5% perf-profile.self.cycles-pp.format_decode
0.26 ± 10% -0.1 0.16 ± 9% perf-profile.self.cycles-pp.number
0.21 ± 9% -0.1 0.11 ± 12% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
0.24 ± 13% -0.1 0.14 ± 11% perf-profile.self.cycles-pp.vsnprintf
0.25 ± 11% -0.1 0.16 ± 9% perf-profile.self.cycles-pp.__entry_text_start
0.17 ± 4% -0.1 0.08 ± 7% perf-profile.self.cycles-pp.dequeue_entity
0.13 ± 26% -0.1 0.06 ± 46% perf-profile.self.cycles-pp.__legitimize_mnt
0.23 ± 12% -0.1 0.15 ± 12% perf-profile.self.cycles-pp.resched_curr
0.18 ± 4% -0.1 0.11 ± 14% perf-profile.self.cycles-pp.copy_fpregs_to_fpstate
0.10 ± 17% -0.1 0.03 ±100% perf-profile.self.cycles-pp.rcu_read_unlock_strict
0.19 ± 11% -0.1 0.12 ± 12% perf-profile.self.cycles-pp.memcpy_erms
0.10 ± 14% -0.1 0.03 ±101% perf-profile.self.cycles-pp.check_preempt_curr
0.31 ± 7% -0.1 0.24 ± 9% perf-profile.self.cycles-pp.eventfd_read
0.14 ± 6% -0.1 0.07 ± 15% perf-profile.self.cycles-pp.dequeue_task_fair
0.16 ± 13% -0.1 0.09 ± 19% perf-profile.self.cycles-pp.finish_task_switch
0.10 ± 14% -0.1 0.04 ± 71% perf-profile.self.cycles-pp.pick_next_task_fair
0.15 ± 15% -0.1 0.09 ± 8% perf-profile.self.cycles-pp.__wrgsbase_inactive
0.27 ± 6% -0.1 0.21 ± 9% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
0.12 ± 9% -0.1 0.06 ± 19% perf-profile.self.cycles-pp.task_work_add
0.26 ± 5% -0.1 0.20 ± 16% perf-profile.self.cycles-pp.read_tsc
0.09 ± 7% -0.1 0.04 ± 71% perf-profile.self.cycles-pp.lookup_fast
0.08 ± 11% -0.1 0.03 ±100% perf-profile.self.cycles-pp.strncpy_from_user
0.09 ± 5% -0.1 0.04 ± 71% perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
0.08 ± 14% -0.1 0.03 ±100% perf-profile.self.cycles-pp.__fsnotify_parent
0.19 ± 8% -0.1 0.14 ± 10% perf-profile.self.cycles-pp.place_entity
0.10 ± 13% -0.0 0.06 ± 8% perf-profile.self.cycles-pp.path_openat
0.08 ± 7% -0.0 0.03 ± 70% perf-profile.self.cycles-pp.pid_revalidate
0.15 ± 8% -0.0 0.11 ± 10% perf-profile.self.cycles-pp.new_sync_read
0.10 ± 6% -0.0 0.05 ± 46% perf-profile.self.cycles-pp.copy_user_generic_unrolled
0.11 ± 4% -0.0 0.06 ± 14% perf-profile.self.cycles-pp.seq_show
0.09 ± 5% -0.0 0.05 ± 47% perf-profile.self.cycles-pp.update_min_vruntime
0.10 ± 8% -0.0 0.06 ± 8% perf-profile.self.cycles-pp.link_path_walk
0.09 ± 14% -0.0 0.05 ± 45% perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
0.13 ± 8% -0.0 0.09 ± 17% perf-profile.self.cycles-pp.fsnotify
0.12 ± 8% -0.0 0.08 ± 11% perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
0.08 ± 8% -0.0 0.05 ± 45% perf-profile.self.cycles-pp.call_cpuidle
0.10 ± 8% -0.0 0.06 ± 19% perf-profile.self.cycles-pp.___perf_sw_event
0.09 ± 12% -0.0 0.06 ± 8% perf-profile.self.cycles-pp.rcu_idle_exit
0.10 ± 12% -0.0 0.06 ± 11% perf-profile.self.cycles-pp.__calc_delta
0.08 ± 5% -0.0 0.05 ± 8% perf-profile.self.cycles-pp.__rdgsbase_inactive
0.09 ± 5% -0.0 0.06 ± 19% perf-profile.self.cycles-pp.exit_to_user_mode_prepare
0.00 +0.1 0.06 ± 6% perf-profile.self.cycles-pp.map_kernel_range_noflush
0.07 ± 27% +0.1 0.15 ± 28% perf-profile.self.cycles-pp.ktime_get_update_offsets_now
0.00 +0.1 0.11 ± 12% perf-profile.self.cycles-pp.__get_vm_area_node
0.00 +0.1 0.13 ± 9% perf-profile.self.cycles-pp.rb_insert_color
0.00 +0.3 0.31 ± 8% perf-profile.self.cycles-pp.__alloc_pages_nodemask
0.00 +0.3 0.31 ± 8% perf-profile.self.cycles-pp.__free_pages
0.00 +0.3 0.31 ± 9% perf-profile.self.cycles-pp.__vmalloc_node_range
0.00 +0.3 0.34 ± 11% perf-profile.self.cycles-pp.__vunmap
0.00 +0.4 0.38 ± 10% perf-profile.self.cycles-pp.unmap_kernel_range_noflush
0.00 +0.4 0.39 ± 7% perf-profile.self.cycles-pp.rb_erase
0.00 +0.6 0.61 ± 8% perf-profile.self.cycles-pp.remove_vm_area
0.00 +0.7 0.68 ± 11% perf-profile.self.cycles-pp.__schedule
0.00 +0.7 0.72 ± 6% perf-profile.self.cycles-pp.find_vmap_area
0.00 +1.0 0.96 ± 6% perf-profile.self.cycles-pp.insert_vmap_area
0.00 +1.0 1.02 ± 13% perf-profile.self.cycles-pp.alloc_vmap_area
0.00 +1.0 1.02 ± 8% perf-profile.self.cycles-pp.free_vmap_area_noflush
0.69 ± 7% +1.4 2.10 ± 8% perf-profile.self.cycles-pp._raw_spin_lock
0.00 +26.3 26.26 ± 8% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath



stress-ng.time.user_time

[ASCII plot not reproduced: y-axis 15 to 45; bisect-good samples lie roughly between 30 and 40, bisect-bad samples roughly between 15 and 20]


stress-ng.time.voluntary_context_switches

[ASCII plot not reproduced: y-axis 3e+07 to 7.5e+07; bisect-good samples lie roughly between 6.5e+07 and 7.5e+07, bisect-bad samples roughly between 3e+07 and 3.5e+07]


stress-ng.eventfd.ops

[ASCII plot not reproduced: y-axis 1.6e+07 to 3.6e+07; bisect-good samples lie roughly between 3.2e+07 and 3.6e+07, bisect-bad samples at or slightly above 1.8e+07]


stress-ng.eventfd.ops_per_sec

[ASCII plot not reproduced: y-axis 250000 to 600000; bisect-good samples lie roughly between 500000 and 600000, bisect-bad samples roughly between 250000 and 300000]


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (73.56 kB)
config-5.11.0-00004-g5fd6060e506c (175.11 kB)
job-script (8.23 kB)
job.yaml (5.66 kB)
reproduce (552.00 B)

2021-03-19 19:33:50

by Kees Cook

Subject: Re: [seq_file] 5fd6060e50: stress-ng.eventfd.ops_per_sec -49.1% regression

On Fri, Mar 19, 2021 at 10:07:42PM +0800, kernel test robot wrote:
> FYI, we noticed a -49.1% regression of stress-ng.eventfd.ops_per_sec due to commit:

Well, so it can be seen. ;) Though I feel slightly better that it's stress-ng
instead of a "normal" workload.

Thanks for the report!

--
Kees Cook