This patch series helps to prevent load/store tearing in
several cgroup knobs.
As kindly pointed out by Michal Hocko and Roman Gushchin,
the changelog has been rephrased.
Besides, more knobs were checked, according to kind suggestions
from Shakeel Butt and Muchun Song.
v1:
- Add [WRITE|READ]_ONCE for memcg->oom_group
v2:
- Rephrase changelog
- Add [WRITE|READ]_ONCE for memcg->oom_kill_disable,
memcg->swappiness, vm_swappiness and memcg->soft_limit
Yue Zhao (4):
mm, memcg: Prevent memory.oom.group load/store tearing
mm, memcg: Prevent memory.swappiness load/store tearing
mm, memcg: Prevent memory.oom_control load/store tearing
mm, memcg: Prevent memory.soft_limit_in_bytes load/store tearing
include/linux/swap.h | 8 ++++----
mm/memcontrol.c | 18 +++++++++---------
2 files changed, 13 insertions(+), 13 deletions(-)
--
2.17.1
The knob for cgroup v2 memory controller: memory.oom.group
is not protected by any locking so it can be modified while it is used.
This is not an actual problem because races are unlikely (the knob is
usually configured long before any workload hits an actual memcg OOM)
but it is better to use READ_ONCE/WRITE_ONCE to prevent the compiler from
doing anything funky.
The access of memcg->oom_group is lockless, so it can be
concurrently set at the same time as we are trying to read it.
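
For readers less familiar with these macros, here is a minimal userspace
sketch of the pattern this patch applies. The READ_ONCE/WRITE_ONCE
definitions below are simplified stand-ins built on a volatile cast (the
kernel's own definitions are more elaborate), and struct demo_memcg and the
demo_* helpers are hypothetical names for illustration, not kernel code.
The sketch assumes GCC or Clang for __typeof__.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified stand-ins for the kernel macros (assumption: the real
 * definitions do more). The volatile cast asks the compiler to emit
 * exactly one access for each marked load or store.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical stand-in for struct mem_cgroup; only the knob is modelled. */
struct demo_memcg {
	bool oom_group;
};

/* Writer side: same shape as memory_oom_group_write() in the diff. */
static inline int demo_oom_group_write(struct demo_memcg *memcg, int val)
{
	if (val != 0 && val != 1)
		return -1;	/* the kernel code returns -EINVAL here */

	WRITE_ONCE(memcg->oom_group, val);
	return 0;
}

/* Reader side: one marked load, as in memory_oom_group_show(). */
static inline int demo_oom_group_read(struct demo_memcg *memcg)
{
	return READ_ONCE(memcg->oom_group);
}

/* Small self-check showing the intended use of the two helpers. */
static inline void demo_check(void)
{
	struct demo_memcg memcg = { .oom_group = false };

	assert(demo_oom_group_write(&memcg, 1) == 0);
	assert(demo_oom_group_read(&memcg) == 1);
}
```

The point of the sketch is only that the reader and the writer both use a
marked access, so neither side relies on a lock to see a whole value.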
Signed-off-by: Yue Zhao <[email protected]>
---
mm/memcontrol.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5abffe6f8389..06821e5f7604 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2067,7 +2067,7 @@ struct mem_cgroup *mem_cgroup_get_oom_group(struct task_struct *victim,
* highest-level memory cgroup with oom.group set.
*/
for (; memcg; memcg = parent_mem_cgroup(memcg)) {
- if (memcg->oom_group)
+ if (READ_ONCE(memcg->oom_group))
oom_group = memcg;
if (memcg == oom_domain)
@@ -6623,7 +6623,7 @@ static int memory_oom_group_show(struct seq_file *m, void *v)
{
struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
- seq_printf(m, "%d\n", memcg->oom_group);
+ seq_printf(m, "%d\n", READ_ONCE(memcg->oom_group));
return 0;
}
@@ -6645,7 +6645,7 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of,
if (oom_group != 0 && oom_group != 1)
return -EINVAL;
- memcg->oom_group = oom_group;
+ WRITE_ONCE(memcg->oom_group, oom_group);
return nbytes;
}
--
2.17.1
The knob for cgroup v1 memory controller: memory.swappiness
is not protected by any locking so it can be modified while it is used.
This is not an actual problem because races are unlikely.
But it is better to use READ_ONCE/WRITE_ONCE to prevent the compiler from
doing anything funky.
The access of memcg->swappiness and vm_swappiness is lockless,
so both of them can be concurrently set at the same time
as we are trying to read them.
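
As a rough illustration of the lockless read and write paths touched here,
the following userspace sketch mirrors the fallback logic of
mem_cgroup_swappiness() and the write handler. Everything prefixed demo_ is
hypothetical, the READ_ONCE/WRITE_ONCE macros are simplified volatile-cast
stand-ins for the kernel ones, and the initial value 60 and the upper bound
200 are assumptions made for the sketch. It assumes GCC or Clang for
__typeof__.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel macros (assumption, see above). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical global mirroring vm_swappiness; 60 is an assumed default. */
static int demo_vm_swappiness = 60;

/* Hypothetical stand-in for struct mem_cgroup. */
struct demo_memcg {
	int swappiness;
	int is_root;
};

/* Reader: the root falls back to the global, others use their own value. */
static inline int demo_swappiness(struct demo_memcg *memcg)
{
	if (memcg->is_root)
		return READ_ONCE(demo_vm_swappiness);

	return READ_ONCE(memcg->swappiness);
}

/* Writer: same branch structure as mem_cgroup_swappiness_write(). */
static inline int demo_swappiness_write(struct demo_memcg *memcg, int val)
{
	if (val < 0 || val > 200)	/* assumed valid range for the sketch */
		return -1;

	if (!memcg->is_root)
		WRITE_ONCE(memcg->swappiness, val);
	else
		WRITE_ONCE(demo_vm_swappiness, val);

	return 0;
}

/* Small self-check showing the intended use of the two helpers. */
static inline void demo_check(void)
{
	struct demo_memcg child = { .swappiness = 10, .is_root = 0 };

	assert(demo_swappiness_write(&child, 100) == 0);
	assert(demo_swappiness(&child) == 100);
}
```

A caller that needs the value more than once would typically store the
result of demo_swappiness() in a local variable and reuse that snapshot,
rather than reading the shared field again.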
Signed-off-by: Yue Zhao <[email protected]>
---
include/linux/swap.h | 8 ++++----
mm/memcontrol.c | 4 ++--
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 209a425739a9..3f3fe43d1766 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -620,18 +620,18 @@ static inline int mem_cgroup_swappiness(struct mem_cgroup *memcg)
{
/* Cgroup2 doesn't have per-cgroup swappiness */
if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
- return vm_swappiness;
+ return READ_ONCE(vm_swappiness);
/* root ? */
if (mem_cgroup_disabled() || mem_cgroup_is_root(memcg))
- return vm_swappiness;
+ return READ_ONCE(vm_swappiness);
- return memcg->swappiness;
+ return READ_ONCE(memcg->swappiness);
}
#else
static inline int mem_cgroup_swappiness(struct mem_cgroup *mem)
{
- return vm_swappiness;
+ return READ_ONCE(vm_swappiness);
}
#endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 06821e5f7604..dca895c66a9b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4179,9 +4179,9 @@ static int mem_cgroup_swappiness_write(struct cgroup_subsys_state *css,
return -EINVAL;
if (!mem_cgroup_is_root(memcg))
- memcg->swappiness = val;
+ WRITE_ONCE(memcg->swappiness, val);
else
- vm_swappiness = val;
+ WRITE_ONCE(vm_swappiness, val);
return 0;
}
--
2.17.1
The knob for cgroup v1 memory controller: memory.oom_control
is not protected by any locking so it can be modified while it is used.
This is not an actual problem because races are unlikely.
But it is better to use READ_ONCE/WRITE_ONCE to prevent the compiler from
doing anything funky.
The access of memcg->oom_kill_disable is lockless,
so it can be concurrently set at the same time as we are
trying to read it.
Signed-off-by: Yue Zhao <[email protected]>
---
mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index dca895c66a9b..26605b2f51b1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4515,7 +4515,7 @@ static int mem_cgroup_oom_control_read(struct seq_file *sf, void *v)
{
struct mem_cgroup *memcg = mem_cgroup_from_seq(sf);
- seq_printf(sf, "oom_kill_disable %d\n", memcg->oom_kill_disable);
+ seq_printf(sf, "oom_kill_disable %d\n", READ_ONCE(memcg->oom_kill_disable));
seq_printf(sf, "under_oom %d\n", (bool)memcg->under_oom);
seq_printf(sf, "oom_kill %lu\n",
atomic_long_read(&memcg->memory_events[MEMCG_OOM_KILL]));
@@ -4531,7 +4531,7 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
if (mem_cgroup_is_root(memcg) || !((val == 0) || (val == 1)))
return -EINVAL;
- memcg->oom_kill_disable = val;
+ WRITE_ONCE(memcg->oom_kill_disable, val);
if (!val)
memcg_oom_recover(memcg);
--
2.17.1
The knob for cgroup v1 memory controller: memory.soft_limit_in_bytes
is not protected by any locking so it can be modified while it is used.
This is not an actual problem because races are unlikely.
But it is better to use READ_ONCE/WRITE_ONCE to prevent the compiler from
doing anything funky.
The access of memcg->soft_limit is lockless,
so it can be concurrently set at the same time as we are
trying to read it.
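
To make the read side concrete, here is a small userspace sketch of the
RES_SOFT_LIMIT case: the page count is loaded once and only then widened
and scaled to bytes. struct demo_memcg, the demo_* helpers and
DEMO_PAGE_SIZE (assumed to be 4096) are hypothetical, and the
READ_ONCE/WRITE_ONCE macros are simplified volatile-cast stand-ins for the
kernel ones. It assumes GCC or Clang for __typeof__.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel macros (assumption, see above). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define DEMO_PAGE_SIZE 4096UL	/* assumed page size for the sketch */

/* Hypothetical stand-in for struct mem_cgroup; soft_limit is in pages. */
struct demo_memcg {
	unsigned long soft_limit;
};

/* Writer: a single marked store, as in the mem_cgroup_write() hunk. */
static inline void demo_soft_limit_set(struct demo_memcg *memcg,
				       unsigned long nr_pages)
{
	WRITE_ONCE(memcg->soft_limit, nr_pages);
}

/*
 * Reader: load the page count once, widen it to 64 bits, then scale.
 * Widening before the multiply keeps the result correct even where
 * unsigned long is only 32 bits wide.
 */
static inline uint64_t demo_soft_limit_bytes(struct demo_memcg *memcg)
{
	return (uint64_t)READ_ONCE(memcg->soft_limit) * DEMO_PAGE_SIZE;
}

/* Small self-check showing the intended use of the two helpers. */
static inline void demo_check(void)
{
	struct demo_memcg memcg = { .soft_limit = 0 };

	demo_soft_limit_set(&memcg, 3);
	assert(demo_soft_limit_bytes(&memcg) == 12288ULL);
}
```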
Signed-off-by: Yue Zhao <[email protected]>
---
mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 26605b2f51b1..20566f59bbcb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3728,7 +3728,7 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
case RES_FAILCNT:
return counter->failcnt;
case RES_SOFT_LIMIT:
- return (u64)memcg->soft_limit * PAGE_SIZE;
+ return (u64)READ_ONCE(memcg->soft_limit) * PAGE_SIZE;
default:
BUG();
}
@@ -3870,7 +3870,7 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
ret = -EOPNOTSUPP;
} else {
- memcg->soft_limit = nr_pages;
+ WRITE_ONCE(memcg->soft_limit, nr_pages);
ret = 0;
}
break;
--
2.17.1
On Mon, Mar 6, 2023 at 7:42 AM Yue Zhao <[email protected]> wrote:
>
> This patch series helps to prevent load/store tearing in
> several cgroup knobs.
>
> As kindly pointed out by Michal Hocko and Roman Gushchin,
> the changelog has been rephrased.
>
> Besides, more knobs were checked, according to kind suggestions
> from Shakeel Butt and Muchun Song.
>
> v1:
> - Add [WRITE|READ]_ONCE for memcg->oom_group
> v2:
> - Rephrase changelog
> - Add [WRITE|READ]_ONCE for memcg->oom_kill_disable,
> memcg->swappiness, vm_swappiness and memcg->soft_limit
>
Thanks Yue, and for the whole series:
Acked-by: Shakeel Butt <[email protected]>
On Mon 06-03-23 23:41:36, Yue Zhao wrote:
> The knob for cgroup v1 memory controller: memory.swappiness
> is not protected by any locking so it can be modified while it is used.
> This is not an actual problem because races are unlikely.
> But it is better to use READ_ONCE/WRITE_ONCE to prevent the compiler from
> doing anything funky.
>
> The access of memcg->swappiness and vm_swappiness is lockless,
> so both of them can be concurrently set at the same time
> as we are trying to read them.
>
> Signed-off-by: Yue Zhao <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Thanks!
> ---
> include/linux/swap.h | 8 ++++----
> mm/memcontrol.c | 4 ++--
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 209a425739a9..3f3fe43d1766 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -620,18 +620,18 @@ static inline int mem_cgroup_swappiness(struct mem_cgroup *memcg)
> {
> /* Cgroup2 doesn't have per-cgroup swappiness */
> if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
> - return vm_swappiness;
> + return READ_ONCE(vm_swappiness);
>
> /* root ? */
> if (mem_cgroup_disabled() || mem_cgroup_is_root(memcg))
> - return vm_swappiness;
> + return READ_ONCE(vm_swappiness);
>
> - return memcg->swappiness;
> + return READ_ONCE(memcg->swappiness);
> }
> #else
> static inline int mem_cgroup_swappiness(struct mem_cgroup *mem)
> {
> - return vm_swappiness;
> + return READ_ONCE(vm_swappiness);
> }
> #endif
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 06821e5f7604..dca895c66a9b 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4179,9 +4179,9 @@ static int mem_cgroup_swappiness_write(struct cgroup_subsys_state *css,
> return -EINVAL;
>
> if (!mem_cgroup_is_root(memcg))
> - memcg->swappiness = val;
> + WRITE_ONCE(memcg->swappiness, val);
> else
> - vm_swappiness = val;
> + WRITE_ONCE(vm_swappiness, val);
>
> return 0;
> }
> --
> 2.17.1
--
Michal Hocko
SUSE Labs
On Mon 06-03-23 23:41:35, Yue Zhao wrote:
> The knob for cgroup v2 memory controller: memory.oom.group
> is not protected by any locking so it can be modified while it is used.
> This is not an actual problem because races are unlikely (the knob is
> usually configured long before any workload hits an actual memcg OOM)
> but it is better to use READ_ONCE/WRITE_ONCE to prevent the compiler from
> doing anything funky.
>
> The access of memcg->oom_group is lockless, so it can be
> concurrently set at the same time as we are trying to read it.
>
> Signed-off-by: Yue Zhao <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Thanks!
> ---
> mm/memcontrol.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 5abffe6f8389..06821e5f7604 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2067,7 +2067,7 @@ struct mem_cgroup *mem_cgroup_get_oom_group(struct task_struct *victim,
> * highest-level memory cgroup with oom.group set.
> */
> for (; memcg; memcg = parent_mem_cgroup(memcg)) {
> - if (memcg->oom_group)
> + if (READ_ONCE(memcg->oom_group))
> oom_group = memcg;
>
> if (memcg == oom_domain)
> @@ -6623,7 +6623,7 @@ static int memory_oom_group_show(struct seq_file *m, void *v)
> {
> struct mem_cgroup *memcg = mem_cgroup_from_seq(m);
>
> - seq_printf(m, "%d\n", memcg->oom_group);
> + seq_printf(m, "%d\n", READ_ONCE(memcg->oom_group));
>
> return 0;
> }
> @@ -6645,7 +6645,7 @@ static ssize_t memory_oom_group_write(struct kernfs_open_file *of,
> if (oom_group != 0 && oom_group != 1)
> return -EINVAL;
>
> - memcg->oom_group = oom_group;
> + WRITE_ONCE(memcg->oom_group, oom_group);
>
> return nbytes;
> }
> --
> 2.17.1
--
Michal Hocko
SUSE Labs
On Mon, Mar 06, 2023 at 11:41:34PM +0800, Yue Zhao wrote:
> This patch series helps to prevent load/store tearing in
> several cgroup knobs.
>
> As kindly pointed out by Michal Hocko and Roman Gushchin,
> the changelog has been rephrased.
>
> Besides, more knobs were checked, according to kind suggestions
> from Shakeel Butt and Muchun Song.
>
> v1:
> - Add [WRITE|READ]_ONCE for memcg->oom_group
> v2:
> - Rephrase changelog
> - Add [WRITE|READ]_ONCE for memcg->oom_kill_disable,
> memcg->swappiness, vm_swappiness and memcg->soft_limit
>
> Yue Zhao (4):
> mm, memcg: Prevent memory.oom.group load/store tearing
> mm, memcg: Prevent memory.swappiness load/store tearing
> mm, memcg: Prevent memory.oom_control load/store tearing
> mm, memcg: Prevent memory.soft_limit_in_bytes load/store tearing
Acked-by: Roman Gushchin <[email protected]>
for the series.
Thank you!
On Mon 06-03-23 23:41:37, Yue Zhao wrote:
> The knob for cgroup v1 memory controller: memory.oom_control
> is not protected by any locking so it can be modified while it is used.
> This is not an actual problem because races are unlikely.
> But it is better to use READ_ONCE/WRITE_ONCE to prevent the compiler from
> doing anything funky.
>
> The access of memcg->oom_kill_disable is lockless,
> so it can be concurrently set at the same time as we are
> trying to read it.
>
> Signed-off-by: Yue Zhao <[email protected]>
> ---
> mm/memcontrol.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index dca895c66a9b..26605b2f51b1 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4515,7 +4515,7 @@ static int mem_cgroup_oom_control_read(struct seq_file *sf, void *v)
> {
> struct mem_cgroup *memcg = mem_cgroup_from_seq(sf);
>
> - seq_printf(sf, "oom_kill_disable %d\n", memcg->oom_kill_disable);
> + seq_printf(sf, "oom_kill_disable %d\n", READ_ONCE(memcg->oom_kill_disable));
> seq_printf(sf, "under_oom %d\n", (bool)memcg->under_oom);
> seq_printf(sf, "oom_kill %lu\n",
> atomic_long_read(&memcg->memory_events[MEMCG_OOM_KILL]));
> @@ -4531,7 +4531,7 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
> if (mem_cgroup_is_root(memcg) || !((val == 0) || (val == 1)))
> return -EINVAL;
>
> - memcg->oom_kill_disable = val;
> + WRITE_ONCE(memcg->oom_kill_disable, val);
> if (!val)
> memcg_oom_recover(memcg);
Any specific reasons you haven't covered other accesses
(mem_cgroup_css_alloc, mem_cgroup_oom, mem_cgroup_oom_synchronize)?
>
> --
> 2.17.1
--
Michal Hocko
SUSE Labs
On Mon 06-03-23 23:41:38, Yue Zhao wrote:
> The knob for cgroup v1 memory controller: memory.soft_limit_in_bytes
> is not protected by any locking so it can be modified while it is used.
> This is not an actual problem because races are unlikely.
> But it is better to use READ_ONCE/WRITE_ONCE to prevent the compiler from
> doing anything funky.
>
> The access of memcg->soft_limit is lockless,
> so it can be concurrently set at the same time as we are
> trying to read it.
Similar here. mem_cgroup_css_reset and mem_cgroup_css_alloc are not
covered.
>
> Signed-off-by: Yue Zhao <[email protected]>
> ---
> mm/memcontrol.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 26605b2f51b1..20566f59bbcb 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3728,7 +3728,7 @@ static u64 mem_cgroup_read_u64(struct cgroup_subsys_state *css,
> case RES_FAILCNT:
> return counter->failcnt;
> case RES_SOFT_LIMIT:
> - return (u64)memcg->soft_limit * PAGE_SIZE;
> + return (u64)READ_ONCE(memcg->soft_limit) * PAGE_SIZE;
> default:
> BUG();
> }
> @@ -3870,7 +3870,7 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
> if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> ret = -EOPNOTSUPP;
> } else {
> - memcg->soft_limit = nr_pages;
> + WRITE_ONCE(memcg->soft_limit, nr_pages);
> ret = 0;
> }
> break;
> --
> 2.17.1
--
Michal Hocko
SUSE Labs
On Tue, Mar 7, 2023 at 1:53 AM Michal Hocko <[email protected]> wrote:
>
> On Mon 06-03-23 23:41:37, Yue Zhao wrote:
> > The knob for cgroup v1 memory controller: memory.oom_control
> > is not protected by any locking so it can be modified while it is used.
> > This is not an actual problem because races are unlikely.
> > But it is better to use READ_ONCE/WRITE_ONCE to prevent the compiler from
> > doing anything funky.
> >
> > The access of memcg->oom_kill_disable is lockless,
> > so it can be concurrently set at the same time as we are
> > trying to read it.
> >
> > Signed-off-by: Yue Zhao <[email protected]>
> > ---
> > mm/memcontrol.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index dca895c66a9b..26605b2f51b1 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -4515,7 +4515,7 @@ static int mem_cgroup_oom_control_read(struct seq_file *sf, void *v)
> > {
> > struct mem_cgroup *memcg = mem_cgroup_from_seq(sf);
> >
> > - seq_printf(sf, "oom_kill_disable %d\n", memcg->oom_kill_disable);
> > + seq_printf(sf, "oom_kill_disable %d\n", READ_ONCE(memcg->oom_kill_disable));
> > seq_printf(sf, "under_oom %d\n", (bool)memcg->under_oom);
> > seq_printf(sf, "oom_kill %lu\n",
> > atomic_long_read(&memcg->memory_events[MEMCG_OOM_KILL]));
> > @@ -4531,7 +4531,7 @@ static int mem_cgroup_oom_control_write(struct cgroup_subsys_state *css,
> > if (mem_cgroup_is_root(memcg) || !((val == 0) || (val == 1)))
> > return -EINVAL;
> >
> > - memcg->oom_kill_disable = val;
> > + WRITE_ONCE(memcg->oom_kill_disable, val);
> > if (!val)
> > memcg_oom_recover(memcg);
>
> Any specific reasons you haven't covered other accesses
> (mem_cgroup_css_alloc, mem_cgroup_oom, mem_cgroup_oom_synchronize)?
Thanks for pointing this out. You are right, we should add
[READ|WRITE]_ONCE at every place the field is accessed.
I will send PATCH v3 later.
I will update the memcg->soft_limit accesses as well.
> >
> > --
> > 2.17.1
>
> --
> Michal Hocko
> SUSE Labs