This patch series helps to prevent load/store tearing of
several cgroup knobs.
As kindly pointed out by Michal Hocko, we should add
[WRITE|READ]_ONCE for all occurrences of memcg->oom_kill_disable,
memcg->swappiness and memcg->soft_limit.
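For background, "load/store tearing" refers to the compiler splitting
a single plain access into several narrower loads or stores (or
merging/refetching accesses), so that a lockless reader can observe a
half-updated value.  As a minimal illustrative sketch of the pattern
this series applies (the names "knob", "val" and "do_something" are
made up for illustration, not code from the series):

	/* writer side: force a single, full-width store */
	WRITE_ONCE(knob, val);

	/* reader side: force a single, full-width load */
	if (READ_ONCE(knob))
		do_something();

The volatile accesses in READ_ONCE()/WRITE_ONCE() keep the compiler
from tearing, caching or re-reading the value behind our back.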
v3:
- Add [WRITE|READ]_ONCE for all occurrences of
memcg->oom_kill_disable, memcg->swappiness and memcg->soft_limit
v2:
- Rephrase changelog
- Add [WRITE|READ]_ONCE for memcg->oom_kill_disable,
memcg->swappiness, vm_swappiness and memcg->soft_limit
v1:
- Add [WRITE|READ]_ONCE for memcg->oom_group
Past patches:
V2: https://lore.kernel.org/linux-mm/[email protected]/
V1: https://lore.kernel.org/linux-mm/[email protected]/
Yue Zhao (4):
mm, memcg: Prevent memory.oom.group load/store tearing
mm, memcg: Prevent memory.swappiness load/store tearing
mm, memcg: Prevent memory.oom_control load/store tearing
mm, memcg: Prevent memory.soft_limit_in_bytes load/store tearing
include/linux/swap.h | 8 ++++----
mm/memcontrol.c | 30 +++++++++++++++---------------
2 files changed, 19 insertions(+), 19 deletions(-)
--
2.17.1
The knob for the cgroup v2 memory controller, memory.oom.group,
is not protected by any locking, so it can be modified while it is
being used. This is not an actual problem because races are unlikely
(the knob is usually configured long before any workload hits an
actual memcg OOM), but it is better to use [READ|WRITE]_ONCE to
prevent the compiler from doing anything funky.
The access of memcg->oom_group is lockless, so it can be set
concurrently while we are trying to read it. All occurrences of
memcg->oom_group are therefore updated with [READ|WRITE]_ONCE.
Signed-off-by: Yue Zhao <[email protected]>
---
mm/memcontrol.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5abffe6f8389..06821e5f7604 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2067,7 +2067,7 @@ struct mem_cgroup *mem_cgroup_get_oom_group(struct task_struct *victim,
 	 * highest-level memory cgroup with oom.group set.
 	 */
 	for (; memcg; memcg = parent_mem_cgroup(memcg)) {
-		if (memcg->oom_group)
+		if (READ_ONCE(memcg->oom_group))
 			oom_group = memcg;
 
 		if (memcg == oom_domain)
@@ -6623,7 +6623,7 @@ static int memory_oom_group_show(struct seq_file *m, void *v)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
 
-	seq_printf(m, "%d\n", memcg->oom_group);
+	seq_printf(m, "%d\n", READ_ONCE(memcg->oom_group));
 
 	return 0;
 }
@@ -6645,7 +6645,7 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of,
 	if (oom_group != 0 && oom_group != 1)
 		return -EINVAL;
 
-	memcg->oom_group = oom_group;
+	WRITE_ONCE(memcg->oom_group, oom_group);
 
 	return nbytes;
 }
--
2.17.1
The knob for the cgroup v1 memory controller, memory.swappiness,
is not protected by any locking, so it can be modified while it is
being used. This is not an actual problem because races are unlikely,
but it is better to use [READ|WRITE]_ONCE to prevent the compiler
from doing anything funky.
The accesses of memcg->swappiness and vm_swappiness are lockless,
so both of them can be set concurrently while we are trying to read
them. All occurrences of memcg->swappiness and vm_swappiness are
therefore updated with [READ|WRITE]_ONCE.
Signed-off-by: Yue Zhao <[email protected]>
---
include/linux/swap.h | 8 ++++----
mm/memcontrol.c | 6 +++---
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 209a425739a9..3f3fe43d1766 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -620,18 +620,18 @@ static inline int mem_cgroup_swappiness(struct mem_cgroup *memcg)
 {
 	/* Cgroup2 doesn't have per-cgroup swappiness */
 	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
-		return vm_swappiness;
+		return READ_ONCE(vm_swappiness);
 
 	/* root ? */
 	if (mem_cgroup_disabled() || mem_cgroup_is_root(memcg))
-		return vm_swappiness;
+		return READ_ONCE(vm_swappiness);
 
-	return memcg->swappiness;
+	return READ_ONCE(memcg->swappiness);
 }
 
 #else
 
 static inline int mem_cgroup_swappiness(struct mem_cgroup *mem)
 {
-	return vm_swappiness;
+	return READ_ONCE(vm_swappiness);
 }
 #endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 06821e5f7604..1b0112afcad3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4179,9 +4179,9 @@ static int mem_cgroup_swappiness_write(struct cgroup_subsys_state *css,
 		return -EINVAL;
 
 	if (!mem_cgroup_is_root(memcg))
-		memcg->swappiness = val;
+		WRITE_ONCE(memcg->swappiness, val);
 	else
-		vm_swappiness = val;
+		WRITE_ONCE(vm_swappiness, val);
 
 	return 0;
 }
@@ -5353,7 +5353,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 #endif
 	page_counter_set_high(&memcg->swap, PAGE_COUNTER_MAX);
 	if (parent) {
-		memcg->swappiness = mem_cgroup_swappiness(parent);
+		WRITE_ONCE(memcg->swappiness, mem_cgroup_swappiness(parent));
 		memcg->oom_kill_disable = parent->oom_kill_disable;
 
 		page_counter_init(&memcg->memory, &parent->memory);
--
2.17.1
The knob for the cgroup v1 memory controller, memory.oom_control,
is not protected by any locking, so it can be modified while it is
being used. This is not an actual problem because races are unlikely,
but it is better to use [READ|WRITE]_ONCE to prevent the compiler
from doing anything funky.
The access of memcg->oom_kill_disable is lockless, so it can be set
concurrently while we are trying to read it. All occurrences of
memcg->oom_kill_disable are therefore updated with [READ|WRITE]_ONCE.
Signed-off-by: Yue Zhao <[email protected]>
---
mm/memcontrol.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1b0112afcad3..5b7062d0f5e0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1929,7 +1929,7 @@ static bool mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
 	 * Please note that mem_cgroup_out_of_memory might fail to find a
 	 * victim and then we have to bail out from the charge path.
 	 */
-	if (memcg->oom_kill_disable) {
+	if (READ_ONCE(memcg->oom_kill_disable)) {
 		if (current->in_user_fault) {
 			css_get(&memcg->css);
 			current->memcg_in_oom = memcg;
@@ -1999,7 +1999,7 @@ bool mem_cgroup_oom_synchronize(bool handle)
 	if (locked)
 		mem_cgroup_oom_notify(memcg);
 
-	if (locked && !memcg->oom_kill_disable) {
+	if (locked && !READ_ONCE(memcg->oom_kill_disable)) {
 		mem_cgroup_unmark_under_oom(memcg);
 		finish_wait(&memcg_oom_waitq, &owait.wait);
 		mem_cgroup_out_of_memory(memcg, current->memcg_oom_gfp_mask,
@@ -4515,7 +4515,7 @@ static int mem_cgroup_oom_control_read(struct seq_file *sf, void *v)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_seq(sf);
 
-	seq_printf(sf, "oom_kill_disable %d\n", memcg->oom_kill_disable);
+	seq_printf(sf, "oom_kill_disable %d\n", READ_ONCE(memcg->oom_kill_disable));
 	seq_printf(sf, "under_oom %d\n", (bool)memcg->under_oom);
 	seq_printf(sf, "oom_kill %lu\n",
 		   atomic_long_read(&memcg->memory_events[MEMCG_OOM_KILL]));
@@ -4531,7 +4531,7 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
 	if (mem_cgroup_is_root(memcg) || !((val == 0) || (val == 1)))
 		return -EINVAL;
 
-	memcg->oom_kill_disable = val;
+	WRITE_ONCE(memcg->oom_kill_disable, val);
 	if (!val)
 		memcg_oom_recover(memcg);
 
@@ -5354,7 +5354,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 	page_counter_set_high(&memcg->swap, PAGE_COUNTER_MAX);
 	if (parent) {
 		WRITE_ONCE(memcg->swappiness, mem_cgroup_swappiness(parent));
-		memcg->oom_kill_disable = parent->oom_kill_disable;
+		WRITE_ONCE(memcg->oom_kill_disable, READ_ONCE(parent->oom_kill_disable));
 
 		page_counter_init(&memcg->memory, &parent->memory);
 		page_counter_init(&memcg->swap, &parent->swap);
--
2.17.1
The knob for the cgroup v1 memory controller, memory.soft_limit_in_bytes,
is not protected by any locking, so it can be modified while it is
being used. This is not an actual problem because races are unlikely,
but it is better to use [READ|WRITE]_ONCE to prevent the compiler
from doing anything funky.
The access of memcg->soft_limit is lockless, so it can be set
concurrently while we are trying to read it. All occurrences of
memcg->soft_limit are therefore updated with [READ|WRITE]_ONCE.
Signed-off-by: Yue Zhao <[email protected]>
---
mm/memcontrol.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5b7062d0f5e0..13ec89c45389 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3728,7 +3728,7 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
 	case RES_FAILCNT:
 		return counter->failcnt;
 	case RES_SOFT_LIMIT:
-		return (u64)memcg->soft_limit * PAGE_SIZE;
+		return (u64)READ_ONCE(memcg->soft_limit) * PAGE_SIZE;
 	default:
 		BUG();
 	}
@@ -3870,7 +3870,7 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
 			ret = -EOPNOTSUPP;
 		} else {
-			memcg->soft_limit = nr_pages;
+			WRITE_ONCE(memcg->soft_limit, nr_pages);
 			ret = 0;
 		}
 		break;
@@ -5347,7 +5347,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 		return ERR_CAST(memcg);
 
 	page_counter_set_high(&memcg->memory, PAGE_COUNTER_MAX);
-	memcg->soft_limit = PAGE_COUNTER_MAX;
+	WRITE_ONCE(memcg->soft_limit, PAGE_COUNTER_MAX);
 #if defined(CONFIG_MEMCG_KMEM) && defined(CONFIG_ZSWAP)
 	memcg->zswap_max = PAGE_COUNTER_MAX;
 #endif
@@ -5502,7 +5502,7 @@ static void mem_cgroup_css_reset(struct cgroup_subsys_state *css)
 	page_counter_set_min(&memcg->memory, 0);
 	page_counter_set_low(&memcg->memory, 0);
 	page_counter_set_high(&memcg->memory, PAGE_COUNTER_MAX);
-	memcg->soft_limit = PAGE_COUNTER_MAX;
+	WRITE_ONCE(memcg->soft_limit, PAGE_COUNTER_MAX);
 	page_counter_set_high(&memcg->swap, PAGE_COUNTER_MAX);
 	memcg_wb_domain_size_changed(memcg);
 }
--
2.17.1
On Thu 09-03-23 00:25:51, Yue Zhao wrote:
> This patch series helps to prevent load/store tearing of
> several cgroup knobs.
>
> As kindly pointed out by Michal Hocko, we should add
> [WRITE|READ]_ONCE for all occurrences of memcg->oom_kill_disable,
> memcg->swappiness and memcg->soft_limit.
>
> v3:
> - Add [WRITE|READ]_ONCE for all occurrences of
> memcg->oom_kill_disable, memcg->swappiness and memcg->soft_limit
> v2:
> - Rephrase changelog
> - Add [WRITE|READ]_ONCE for memcg->oom_kill_disable,
> memcg->swappiness, vm_swappiness and memcg->soft_limit
> v1:
> - Add [WRITE|READ]_ONCE for memcg->oom_group
>
> Past patches:
> V2: https://lore.kernel.org/linux-mm/[email protected]/
> V1: https://lore.kernel.org/linux-mm/[email protected]/
>
> Yue Zhao (4):
> mm, memcg: Prevent memory.oom.group load/store tearing
> mm, memcg: Prevent memory.swappiness load/store tearing
> mm, memcg: Prevent memory.oom_control load/store tearing
> mm, memcg: Prevent memory.soft_limit_in_bytes load/store tearing
>
> include/linux/swap.h | 8 ++++----
> mm/memcontrol.c | 30 +++++++++++++++---------------
> 2 files changed, 19 insertions(+), 19 deletions(-)
Acked-by: Michal Hocko <[email protected]>
Btw. you could have preserved acks for patches you haven't changed from
the previous version.
--
Michal Hocko
SUSE Labs
On Wed, Mar 8, 2023 at 9:40 AM Michal Hocko <[email protected]> wrote:
>
> On Thu 09-03-23 00:25:51, Yue Zhao wrote:
> > This patch series helps to prevent load/store tearing of
> > several cgroup knobs.
> >
> > As kindly pointed out by Michal Hocko, we should add
> > [WRITE|READ]_ONCE for all occurrences of memcg->oom_kill_disable,
> > memcg->swappiness and memcg->soft_limit.
> >
> > v3:
> > - Add [WRITE|READ]_ONCE for all occurrences of
> > memcg->oom_kill_disable, memcg->swappiness and memcg->soft_limit
> > v2:
> > - Rephrase changelog
> > - Add [WRITE|READ]_ONCE for memcg->oom_kill_disable,
> > memcg->swappiness, vm_swappiness and memcg->soft_limit
> > v1:
> > - Add [WRITE|READ]_ONCE for memcg->oom_group
> >
> > Past patches:
> > V2: https://lore.kernel.org/linux-mm/[email protected]/
> > V1: https://lore.kernel.org/linux-mm/[email protected]/
> >
> > Yue Zhao (4):
> > mm, memcg: Prevent memory.oom.group load/store tearing
> > mm, memcg: Prevent memory.swappiness load/store tearing
> > mm, memcg: Prevent memory.oom_control load/store tearing
> > mm, memcg: Prevent memory.soft_limit_in_bytes load/store tearing
> >
> > include/linux/swap.h | 8 ++++----
> > mm/memcontrol.c | 30 +++++++++++++++---------------
> > 2 files changed, 19 insertions(+), 19 deletions(-)
>
> Acked-by: Michal Hocko <[email protected]>
>
> Btw. you could have preserved acks for patches you haven't changed from
> the previous version.
>
For the whole series:
Acked-by: Shakeel Butt <[email protected]>