When mod_objcg_state() is called with a pgdat that is different from
that in the obj_stock, the old lruvec data cached in obj_stock are
flushed out. Unfortunately, they were flushed to the new pgdat and
hence the wrong node, not the one cached in obj_stock.

Fix that by flushing the data to the cached pgdat instead.

Fixes: 68ac5b3c8db2 ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp")
Signed-off-by: Waiman Long <[email protected]>
---
mm/memcontrol.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ae1f5d0cb581..881ec4ddddcd 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3106,17 +3106,19 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
stock->cached_pgdat = pgdat;
} else if (stock->cached_pgdat != pgdat) {
/* Flush the existing cached vmstat data */
+ struct pglist_data *oldpg = stock->cached_pgdat;
+
+ stock->cached_pgdat = pgdat;
if (stock->nr_slab_reclaimable_b) {
- mod_objcg_mlstate(objcg, pgdat, NR_SLAB_RECLAIMABLE_B,
+ mod_objcg_mlstate(objcg, oldpg, NR_SLAB_RECLAIMABLE_B,
stock->nr_slab_reclaimable_b);
stock->nr_slab_reclaimable_b = 0;
}
if (stock->nr_slab_unreclaimable_b) {
- mod_objcg_mlstate(objcg, pgdat, NR_SLAB_UNRECLAIMABLE_B,
+ mod_objcg_mlstate(objcg, oldpg, NR_SLAB_UNRECLAIMABLE_B,
stock->nr_slab_unreclaimable_b);
stock->nr_slab_unreclaimable_b = 0;
}
- stock->cached_pgdat = pgdat;
}
bytes = (idx == NR_SLAB_RECLAIMABLE_B) ? &stock->nr_slab_reclaimable_b
--
2.18.1
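
For readers following the diff, the intended flush behaviour can be sketched as a small user-space model. This is only an illustration under stated assumptions: `struct node_stat`, `struct stock_model` and `mod_state_model()` are hypothetical stand-ins, not the kernel's `pglist_data`, `obj_stock` or `mod_objcg_state()`, and the model ignores objcg switching, per-cpu access and the unreclaimable counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a node's slab counter; not a kernel structure. */
struct node_stat {
	long slab_reclaimable_b;
};

/* Hypothetical stand-in for the per-cpu stock that batches updates. */
struct stock_model {
	struct node_stat *cached_node;	/* plays the role of stock->cached_pgdat */
	long nr_slab_reclaimable_b;	/* bytes batched on behalf of cached_node */
};

/*
 * Batch 'nr' bytes against 'node'. If the stock currently holds bytes for
 * a different node, flush them to that previously cached node first, which
 * is what the patch does by passing the old pgdat instead of the new one.
 */
static void mod_state_model(struct stock_model *stock,
			    struct node_stat *node, long nr)
{
	assert(node != NULL);

	if (stock->cached_node && stock->cached_node != node) {
		struct node_stat *old = stock->cached_node;

		/* Flush to the node the bytes were accumulated for. */
		old->slab_reclaimable_b += stock->nr_slab_reclaimable_b;
		stock->nr_slab_reclaimable_b = 0;
	}
	stock->cached_node = node;
	stock->nr_slab_reclaimable_b += nr;
}
```

In this model, charging 100 bytes for one node and then 40 bytes for another leaves the first node's counter at 100 and keeps the 40 bytes batched for the second node. Flushing to the incoming node instead would credit the 100 bytes to the wrong counter.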
On Sun 01-08-21 22:28:27, Waiman Long wrote:
> When mod_objcg_state() is called with a pgdat that is different from
> that in the obj_stock, the old lruvec data cached in obj_stock are
> flushed out. Unfortunately, they were flushed to the new pgdat and
> hence the wrong node, not the one cached in obj_stock.
It would be great to explicitly mention user observable problems here. I
do assume this will make slab stats skewed but the effect wouldn't be
very big, right?
> Fix that by flushing the data to the cached pgdat instead.
>
> Fixes: 68ac5b3c8db2 ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp")
> Signed-off-by: Waiman Long <[email protected]>
Acked-by: Michal Hocko <[email protected]>
> ---
> mm/memcontrol.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ae1f5d0cb581..881ec4ddddcd 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3106,17 +3106,19 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
> stock->cached_pgdat = pgdat;
> } else if (stock->cached_pgdat != pgdat) {
> /* Flush the existing cached vmstat data */
> + struct pglist_data *oldpg = stock->cached_pgdat;
> +
> + stock->cached_pgdat = pgdat;
> if (stock->nr_slab_reclaimable_b) {
> - mod_objcg_mlstate(objcg, pgdat, NR_SLAB_RECLAIMABLE_B,
> + mod_objcg_mlstate(objcg, oldpg, NR_SLAB_RECLAIMABLE_B,
> stock->nr_slab_reclaimable_b);
> stock->nr_slab_reclaimable_b = 0;
> }
> if (stock->nr_slab_unreclaimable_b) {
> - mod_objcg_mlstate(objcg, pgdat, NR_SLAB_UNRECLAIMABLE_B,
> + mod_objcg_mlstate(objcg, oldpg, NR_SLAB_UNRECLAIMABLE_B,
> stock->nr_slab_unreclaimable_b);
> stock->nr_slab_unreclaimable_b = 0;
> }
> - stock->cached_pgdat = pgdat;
Minor nit. Is there any reason to move the cached_pgdat? TBH I found the
original way better from the readability POV.
> }
>
> bytes = (idx == NR_SLAB_RECLAIMABLE_B) ? &stock->nr_slab_reclaimable_b
> --
> 2.18.1
--
Michal Hocko
SUSE Labs
On 8/2/21 2:28 AM, Michal Hocko wrote:
> On Sun 01-08-21 22:28:27, Waiman Long wrote:
>> When mod_objcg_state() is called with a pgdat that is different from
>> that in the obj_stock, the old lruvec data cached in obj_stock are
>> flushed out. Unfortunately, they were flushed to the new pgdat and
>> hence the wrong node, not the one cached in obj_stock.
> It would be great to explicitly mention user observable problems here. I
> do assume this will make slab stats skewed but the effect wouldn't be
> very big, right?
It is the /sys/devices/system/node/node*/meminfo that will get skewed.
Not /proc/meminfo. So it is a relatively minor issue. Will update the
patch to mention that.
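
A toy replay can illustrate why only the per-node view is affected. It is a sketch under assumptions, not kernel code: `struct toy_stats`, `replay()` and `total()` are hypothetical names, the node count of two is arbitrary, and `flush_to_new` merely mimics the pre-fix behaviour of flushing to the incoming node.

```c
#include <assert.h>

#define NR_NODES 2	/* hypothetical node count for the illustration */

/* Hypothetical per-node byte counters; not a kernel structure. */
struct toy_stats {
	long node_bytes[NR_NODES];
};

/*
 * Replay one fixed sequence: 100 bytes are batched for node 0, then an
 * update of 40 bytes arrives for node 1, then the batch is drained.
 * flush_to_new != 0 mimics the pre-fix flush target.
 */
static struct toy_stats replay(int flush_to_new)
{
	struct toy_stats st = { { 0, 0 } };
	int cached = 0;		/* node whose bytes are currently batched */
	long batched = 100;	/* bytes already accumulated for node 0 */
	int incoming = 1;	/* node of the next update */

	assert(incoming < NR_NODES);

	if (cached != incoming) {
		/* The only difference between the two variants. */
		st.node_bytes[flush_to_new ? incoming : cached] += batched;
		batched = 0;
		cached = incoming;
	}
	batched += 40;				/* new bytes for node 1 */
	st.node_bytes[cached] += batched;	/* final drain */
	return st;
}

/* Sum over all nodes, standing in for a system-wide total. */
static long total(struct toy_stats st)
{
	return st.node_bytes[0] + st.node_bytes[1];
}
```

With `flush_to_new` set, node 0 ends up with 0 bytes and node 1 with 140. With it cleared, the split is 100 and 40. The sum over both nodes is 140 in either case, so in this model only the per-node attribution moves.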
>> Fix that by flushing the data to the cached pgdat instead.
>>
>> Fixes: 68ac5b3c8db2 ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp")
>> Signed-off-by: Waiman Long <[email protected]>
> Acked-by: Michal Hocko <[email protected]>
>
>> ---
>> mm/memcontrol.c | 8 +++++---
>> 1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index ae1f5d0cb581..881ec4ddddcd 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -3106,17 +3106,19 @@ void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
>> stock->cached_pgdat = pgdat;
>> } else if (stock->cached_pgdat != pgdat) {
>> /* Flush the existing cached vmstat data */
>> + struct pglist_data *oldpg = stock->cached_pgdat;
>> +
>> + stock->cached_pgdat = pgdat;
>> if (stock->nr_slab_reclaimable_b) {
>> - mod_objcg_mlstate(objcg, pgdat, NR_SLAB_RECLAIMABLE_B,
>> + mod_objcg_mlstate(objcg, oldpg, NR_SLAB_RECLAIMABLE_B,
>> stock->nr_slab_reclaimable_b);
>> stock->nr_slab_reclaimable_b = 0;
>> }
>> if (stock->nr_slab_unreclaimable_b) {
>> - mod_objcg_mlstate(objcg, pgdat, NR_SLAB_UNRECLAIMABLE_B,
>> + mod_objcg_mlstate(objcg, oldpg, NR_SLAB_UNRECLAIMABLE_B,
>> stock->nr_slab_unreclaimable_b);
>> stock->nr_slab_unreclaimable_b = 0;
>> }
>> - stock->cached_pgdat = pgdat;
> Minor nit. Is there any reason to move the cached_pgdat? TBH I found the
> original way better from the readability POV.
Right. Will move it back to its original place.
Cheers,
Longman
On Sun, Aug 1, 2021 at 7:28 PM Waiman Long <[email protected]> wrote:
>
> When mod_objcg_state() is called with a pgdat that is different from
> that in the obj_stock, the old lruvec data cached in obj_stock are
> flushed out. Unfortunately, they were flushed to the new pgdat and
> hence the wrong node, not the one cached in obj_stock.
>
> Fix that by flushing the data to the cached pgdat instead.
>
> Fixes: 68ac5b3c8db2 ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp")
> Signed-off-by: Waiman Long <[email protected]>
After incorporating Michal's comments, you can add:
Reviewed-by: Shakeel Butt <[email protected]>