From: wangyong <[email protected]>
Use the global variable num_events to record the number of vmpressure
events registered by the system, and wake up work only when there is
registration event.
Usually, the vmpressure event is not registered in the system, this patch
can avoid waking up work and doing nothing.
Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram.
Consume cgroup memory until it is about to be reclaimed, then execute
"perf stat -I 2000 malloc.out" command to trigger memory reclamation
and get performance results.
The context-switches is reduced by about 20 times.
unpatched:
Average of 10 test results
582.4674048 task-clock(msec)
19910.8 context-switches
0 cpu-migrations
1292.9 page-faults
414784733.1 cycles
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
580070698.4 instructions
125572244.7 branches
2073541.2 branch-misses
patched
Average of 10 test results
973.6174796 task-clock(msec)
988.6 context-switches
0 cpu-migrations
1785.2 page-faults
772883602.4 cycles
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
1360280911 instructions
290519434.9 branches
3378378.2 branch-misses
Tested-by: Zeal Robot <[email protected]>
Signed-off-by: wangyong <[email protected]>
---
Changes since v1:
-Use static_key type data as global variable
-Make event registration judgment earlier
mm/vmpressure.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 76518e4..6f4e984 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -67,6 +67,11 @@ static const unsigned int vmpressure_level_critical = 95;
*/
static const unsigned int vmpressure_level_critical_prio = ilog2(100 / 10);
+/*
+ * Count the number of vmpressure events registered in the system.
+ */
+DEFINE_STATIC_KEY_FALSE(num_events);
+
static struct vmpressure *work_to_vmpressure(struct work_struct *work)
{
return container_of(work, struct vmpressure, work);
@@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
return;
if (tree) {
+ if (!static_branch_unlikely(&num_events))
+ return;
+
spin_lock(&vmpr->sr_lock);
scanned = vmpr->tree_scanned += scanned;
vmpr->tree_reclaimed += reclaimed;
@@ -407,6 +415,7 @@ int vmpressure_register_event(struct mem_cgroup *memcg,
mutex_lock(&vmpr->events_lock);
list_add(&ev->node, &vmpr->events);
mutex_unlock(&vmpr->events_lock);
+ static_branch_inc(&num_events);
ret = 0;
out:
kfree(spec_orig);
@@ -435,6 +444,7 @@ void vmpressure_unregister_event(struct mem_cgroup *memcg,
if (ev->efd != eventfd)
continue;
list_del(&ev->node);
+ static_branch_dec(&num_events);
kfree(ev);
break;
}
--
2.7.4
[email protected] writes:
>From: wangyong <[email protected]>
>
>Use the global variable num_events to record the number of vmpressure
>events registered by the system, and wake up work only when there is
>registration event.
>Usually, the vmpressure event is not registered in the system, this patch
>can avoid waking up work and doing nothing.
>
>Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram.
>Consume cgroup memory until it is about to be reclaimed, then execute
>"perf stat -I 2000 malloc.out" command to trigger memory reclamation
>and get performance results.
>The context-switches is reduced by about 20 times.
>
>unpatched:
>Average of 10 test results
>582.4674048 task-clock(msec)
>19910.8 context-switches
>0 cpu-migrations
>1292.9 page-faults
>414784733.1 cycles
><not supported> stalled-cycles-frontend
><not supported> stalled-cycles-backend
>580070698.4 instructions
>125572244.7 branches
>2073541.2 branch-misses
>
>patched
>Average of 10 test results
>973.6174796 task-clock(msec)
>988.6 context-switches
>0 cpu-migrations
>1785.2 page-faults
>772883602.4 cycles
><not supported> stalled-cycles-frontend
><not supported> stalled-cycles-backend
>1360280911 instructions
>290519434.9 branches
>3378378.2 branch-misses
>
>Tested-by: Zeal Robot <[email protected]>
That's not how Tested-by works. Tested-by is for human testers who have
actively understand and have validated the effects of the code, not CI: please
remove the tag.
>Signed-off-by: wangyong <[email protected]>
>---
>
>Changes since v1:
>-Use static_key type data as global variable
>-Make event registration judgment earlier
>
> mm/vmpressure.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
>diff --git a/mm/vmpressure.c b/mm/vmpressure.c
>index 76518e4..6f4e984 100644
>--- a/mm/vmpressure.c
>+++ b/mm/vmpressure.c
>@@ -67,6 +67,11 @@ static const unsigned int vmpressure_level_critical = 95;
> */
> static const unsigned int vmpressure_level_critical_prio = ilog2(100 / 10);
>
>+/*
>+ * Count the number of vmpressure events registered in the system.
>+ */
>+DEFINE_STATIC_KEY_FALSE(num_events);
>+
> static struct vmpressure *work_to_vmpressure(struct work_struct *work)
> {
> return container_of(work, struct vmpressure, work);
>@@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
> return;
>
> if (tree) {
>+ if (!static_branch_unlikely(&num_events))
>+ return;
>+
> spin_lock(&vmpr->sr_lock);
> scanned = vmpr->tree_scanned += scanned;
> vmpr->tree_reclaimed += reclaimed;
>@@ -407,6 +415,7 @@ int vmpressure_register_event(struct mem_cgroup *memcg,
> mutex_lock(&vmpr->events_lock);
> list_add(&ev->node, &vmpr->events);
> mutex_unlock(&vmpr->events_lock);
>+ static_branch_inc(&num_events);
> ret = 0;
> out:
> kfree(spec_orig);
>@@ -435,6 +444,7 @@ void vmpressure_unregister_event(struct mem_cgroup *memcg,
> if (ev->efd != eventfd)
> continue;
> list_del(&ev->node);
>+ static_branch_dec(&num_events);
> kfree(ev);
> break;
> }
>--
>2.7.4
>
>
On Tue 14-09-21 09:05:51, [email protected] wrote:
> From: wangyong <[email protected]>
>
> Use the global variable num_events to record the number of vmpressure
> events registered by the system, and wake up work only when there is
> registration event.
> Usually, the vmpressure event is not registered in the system, this patch
> can avoid waking up work and doing nothing.
I have asked in the previous version and this changelog doesn't that
explain again. Why don't you simply bail out early in vmpressure()
entry?
> Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram.
> Consume cgroup memory until it is about to be reclaimed, then execute
> "perf stat -I 2000 malloc.out" command to trigger memory reclamation
> and get performance results.
> The context-switches is reduced by about 20 times.
Is this test somewhere available so that it can be reproduced by
others. Also while the number of context switches can be an interesting
it is not really clear from this evaluation whether that actually
matters or not. E.g. what does an increase of task-clock and twice as
many instructions recorded tell us?
> unpatched:
> Average of 10 test results
> 582.4674048 task-clock(msec)
> 19910.8 context-switches
> 0 cpu-migrations
> 1292.9 page-faults
> 414784733.1 cycles
> <not supported> stalled-cycles-frontend
> <not supported> stalled-cycles-backend
Why is this a part of the data?
> 580070698.4 instructions
> 125572244.7 branches
> 2073541.2 branch-misses
>
> patched
> Average of 10 test results
> 973.6174796 task-clock(msec)
> 988.6 context-switches
> 0 cpu-migrations
> 1785.2 page-faults
> 772883602.4 cycles
> <not supported> stalled-cycles-frontend
> <not supported> stalled-cycles-backend
> 1360280911 instructions
> 290519434.9 branches
> 3378378.2 branch-misses
>
> Tested-by: Zeal Robot <[email protected]>
> Signed-off-by: wangyong <[email protected]>
> ---
>
[...]
> @@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
> return;
>
> if (tree) {
> + if (!static_branch_unlikely(&num_events))
> + return;
We usually hide the change behind a static inline helper (e.g.
vmpressure_disabled()). I would also put it to the beginning of
vmpressure or put an explanation why it makes sense only in this branch.
--
Michal Hocko
SUSE Labs
Michal Hocko <[email protected]> 于2021年9月15日周三 下午8:42写道:
>
> On Tue 14-09-21 09:05:51, [email protected] wrote:
> > From: wangyong <[email protected]>
> >
> > Use the global variable num_events to record the number of vmpressure
> > events registered by the system, and wake up work only when there is
> > registration event.
> > Usually, the vmpressure event is not registered in the system, this patch
> > can avoid waking up work and doing nothing.
>
> I have asked in the previous version and this changelog doesn't that
> explain again. Why don't you simply bail out early in vmpressure()
> entry?
Thanks for your reply.
Because the else branch will modify the socket_pressure, and will not wake up
the work. It is necessary to judge the tree parameters at the same
time, like this
if (tree && !static_branch_unlikely(&num_events))
return;
It's not good to judge the tree twice parameters in the function.
>
> > Test with 5.14.0-rc5-next-20210813 on x86_64 4G ram.
> > Consume cgroup memory until it is about to be reclaimed, then execute
> > "perf stat -I 2000 malloc.out" command to trigger memory reclamation
> > and get performance results.
> > The context-switches is reduced by about 20 times.
>
> Is this test somewhere available so that it can be reproduced by
> others. Also while the number of context switches can be an interesting
> it is not really clear from this evaluation whether that actually
> matters or not. E.g. what does an increase of task-clock and twice as
> many instructions recorded tell us?
>
The test program is a simple malloc process, which allocate memory
and fill some data.
I think it may be that more instructions can be executed per unit time.
> > unpatched:
> > Average of 10 test results
> > 582.4674048 task-clock(msec)
> > 19910.8 context-switches
> > 0 cpu-migrations
> > 1292.9 page-faults
> > 414784733.1 cycles
>
> > <not supported> stalled-cycles-frontend
> > <not supported> stalled-cycles-backend
>
> Why is this a part of the data?
>
> > 580070698.4 instructions
> > 125572244.7 branches
> > 2073541.2 branch-misses
> >
> > patched
> > Average of 10 test results
> > 973.6174796 task-clock(msec)
> > 988.6 context-switches
> > 0 cpu-migrations
> > 1785.2 page-faults
> > 772883602.4 cycles
> > <not supported> stalled-cycles-frontend
> > <not supported> stalled-cycles-backend
> > 1360280911 instructions
> > 290519434.9 branches
> > 3378378.2 branch-misses
> >
> > Tested-by: Zeal Robot <[email protected]>
> > Signed-off-by: wangyong <[email protected]>
> > ---
> >
> [...]
> > @@ -272,6 +277,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
> > return;
> >
> > if (tree) {
> > + if (!static_branch_unlikely(&num_events))
> > + return;
>
> We usually hide the change behind a static inline helper (e.g.
> vmpressure_disabled()). I would also put it to the beginning of
> vmpressure or put an explanation why it makes sense only in this branch.
> --
Because only this branch needs to wake up work.
Yes, static inline helper is more easier to read and understand.