This is a new solution to charge ZRAM objects,more simple than
previous one[1],The compressed RAM is currently charged to
kernel,not to any memory cgroup.
As we know, zram can be used in two ways, direct and
indirect, this patchset can charge memory in both cases.
Direct zram usage by process within a cgroup will fail
to charge if there is no memory. Indirect zram usage by
process within a cgroup via swap in PF_MEMALLOC context,
will charge successfully.
[1]
https://lore.kernel.org/all/[email protected]/
Zhongkun He (2):
memcg: Add support for zram object charge
zram: charge the compressed RAM to the page's memcgroup
drivers/block/zram/zram_drv.c | 43 +++++++++++++++++++++++++++++++++++
drivers/block/zram/zram_drv.h | 1 +
include/linux/memcontrol.h | 10 ++++++++
mm/memcontrol.c | 23 +++++++++++++++++++
4 files changed, 77 insertions(+)
--
2.25.1
On Fri 07-07-23 12:46:13, Zhongkun He wrote:
> This is a new solution to charge ZRAM objects,more simple than
> previous one[1],The compressed RAM is currently charged to
> kernel,not to any memory cgroup.
>
> As we know, zram can be used in two ways, direct and
> indirect, this patchset can charge memory in both cases.
> Direct zram usage by process within a cgroup will fail
> to charge if there is no memory. Indirect zram usage by
> process within a cgroup via swap in PF_MEMALLOC context,
> will charge successfully.
Please state the objective you are trying to achieve by this patchset.
It is always good to summarize the previous discussion and mention what
is done differently or how previous review feedback has been addressed
but the overall idea/purpose should be always explicit.
Please elaborate more about both.
> [1]
> https://lore.kernel.org/all/[email protected]/
>
> Zhongkun He (2):
> memcg: Add support for zram object charge
> zram: charge the compressed RAM to the page's memcgroup
>
> drivers/block/zram/zram_drv.c | 43 +++++++++++++++++++++++++++++++++++
> drivers/block/zram/zram_drv.h | 1 +
> include/linux/memcontrol.h | 10 ++++++++
> mm/memcontrol.c | 23 +++++++++++++++++++
> 4 files changed, 77 insertions(+)
>
> --
> 2.25.1
--
Michal Hocko
SUSE Labs
> Please state the objective you are trying to achieve by this patchset.
> It is always good to summarize the previous discussion and mention what
> is done differently or how previous review feedback has been addressed
> but the overall idea/purpose should be always explicit.
>
> Please elaborate more about both.
>
Thanks for your reply.
objective:
the compressed memory of zram charge to the cgroup of the user.
summarize the previous discussion:
[1] As I can see, Michal's concern is that the charges are going to fail
and swapout would fail.
The indirect use of zram is in the context of PF_MEMALLOC, so
the charge must be successful.
[2] David's concern is that if there is a page in the BIO that is not charged,
we can not charge the compressed page for the fs->zram and whether
the recompress case is charged.
In the new solution, the recompress case can be charged. But if there is
a BIO not charged, I'm not so sure, to be honest.it would be great if
someone with more FS->BIO experience could comment.
I have to review the relevant code to check it.
[3] Yosry's concern is that the previous patchsets are very complicated,
and do not have wirteback support in zram.
For the new patchset, charging becomes very simple, and zram supports
writing back to disk.At the same time, I am hoping the use of zswap
without a backing swapfile can be enabled.
To summarize the new patchset:
We charge compressed memory directly in the zram module instead of in
the zsmalloc module because zsmallc may be used by zswap, zswap objects
has been charged once in the zswap module, so zsmallc will double charge.
On Fri 07-07-23 22:25:48, 贺中坤 wrote:
> > Please state the objective you are trying to achieve by this patchset.
> > It is always good to summarize the previous discussion and mention what
> > is done differently or how previous review feedback has been addressed
> > but the overall idea/purpose should be always explicit.
> >
> > Please elaborate more about both.
> >
>
> Thanks for your reply.
> objective:
> the compressed memory of zram charge to the cgroup of the user.
Why do we want/need that?
> summarize the previous discussion:
> [1] As I can see, Michal's concern is that the charges are going to fail
> and swapout would fail.
>
> The indirect use of zram is in the context of PF_MEMALLOC, so
> the charge must be successful.
No, this was not my concern. Please read through that more carefully. My
concern was that the hard limit reclaim would fail. PF_MEMALLOC will not
help in that case as this is not a global reclaim path.
Also let's assume you allow swapout charges to succeed similar to
PF_MEMALLOC. That would mean breaching the limit in an unbounded way,
no?
--
Michal Hocko
SUSE Labs
On Fri, Jul 7, 2023 at 10:44 PM Michal Hocko <[email protected]> wrote:
>
> Why do we want/need that?
Applications can currently escape their cgroup memory containment when
zram is enabled regardless of indirect(swapfile) or direct usage(disk) .
This patch adds memcg accounting to fix it.
Zram and zswap have the same problem,please refer to the patch
corresponding to zswap[1].
[1] https://lore.kernel.org/all/[email protected]/
>
> > summarize the previous discussion:
> > [1] As I can see, Michal's concern is that the charges are going to fail
> > and swapout would fail.
> >
> > The indirect use of zram is in the context of PF_MEMALLOC, so
> > the charge must be successful.
>
> No, this was not my concern. Please read through that more carefully. My
> concern was that the hard limit reclaim would fail. PF_MEMALLOC will not
> help in that case as this is not a global reclaim path.
>
Sorry for my expression. I mean the hard limit reclaim would fail.
As i can see, the PF_MEMALLOC is not only used in global reclaim path
but the mem_cgroup reclaim.
try_charge_memcg
try_to_free_mem_cgroup_pages
noreclaim_flag = memalloc_noreclaim_save();
nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
memalloc_noreclaim_restore(noreclaim_flag);
> Also let's assume you allow swapout charges to succeed similar to
> PF_MEMALLOC. That would mean breaching the limit in an unbounded way,
> no?
> --
Chage compressed page once, mean a page will be freed. the size of compressed
page is less than or equal to the page to be freed. So not an unbounded way.
> Michal Hocko
> SUSE Labs
On Mon 10-07-23 17:35:07, 贺中坤 wrote:
> On Fri, Jul 7, 2023 at 10:44 PM Michal Hocko <[email protected]> wrote:
> >
> > Why do we want/need that?
>
> Applications can currently escape their cgroup memory containment when
> zram is enabled regardless of indirect(swapfile) or direct usage(disk) .
> This patch adds memcg accounting to fix it.
>
> Zram and zswap have the same problem,please refer to the patch
> corresponding to zswap[1].
>
> [1] https://lore.kernel.org/all/[email protected]/
>
> >
> > > summarize the previous discussion:
> > > [1] As I can see, Michal's concern is that the charges are going to fail
> > > and swapout would fail.
> > >
> > > The indirect use of zram is in the context of PF_MEMALLOC, so
> > > the charge must be successful.
> >
> > No, this was not my concern. Please read through that more carefully. My
> > concern was that the hard limit reclaim would fail. PF_MEMALLOC will not
> > help in that case as this is not a global reclaim path.
> >
>
> Sorry for my expression. I mean the hard limit reclaim would fail.
> As i can see, the PF_MEMALLOC is not only used in global reclaim path
> but the mem_cgroup reclaim.
>
> try_charge_memcg
> try_to_free_mem_cgroup_pages
> noreclaim_flag = memalloc_noreclaim_save();
> nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
> memalloc_noreclaim_restore(noreclaim_flag);
My bad, I have overlooked this. I forgot about 89a2848381b5f.
> > Also let's assume you allow swapout charges to succeed similar to
> > PF_MEMALLOC. That would mean breaching the limit in an unbounded way,
> > no?
> > --
>
> Chage compressed page once, mean a page will be freed. the size of compressed
> page is less than or equal to the page to be freed. So not an unbounded way.
OK, this is an important detail to mention. Also have tried to get some
numbers of how much excess is happening for a mixed bag of compressible
memory?
--
Michal Hocko
SUSE Labs
>
> OK, this is an important detail to mention. Also have tried to get some
> numbers of how much excess is happening for a mixed bag of compressible
> memory?
Yes, I've done the test many times. The numbers of excess depend on the
compression ratio only. The maximum amount will not exceed 400KB,and will be
smaller than the hard limit finally.
> --
> Michal Hocko
> SUSE Labs
On Mon 10-07-23 21:16:42, 贺中坤 wrote:
> >
> > OK, this is an important detail to mention. Also have tried to get some
> > numbers of how much excess is happening for a mixed bag of compressible
> > memory?
>
> Yes, I've done the test many times. The numbers of excess depend on the
> compression ratio only. The maximum amount will not exceed 400KB,and will be
> smaller than the hard limit finally.
Again, important data point for your changelog.
--
Michal Hocko
SUSE Labs
On Mon, Jul 10, 2023 at 9:35 PM Michal Hocko <[email protected]> wrote:
>
> On Mon 10-07-23 21:16:42, 贺中坤 wrote:
> > >
> > > OK, this is an important detail to mention. Also have tried to get some
> > > numbers of how much excess is happening for a mixed bag of compressible
> > > memory?
> >
> > Yes, I've done the test many times. The numbers of excess depend on the
> > compression ratio only. The maximum amount will not exceed 400KB,and will be
> > smaller than the hard limit finally.
>
> Again, important data point for your changelog.
Thanks, more data will be added in the next version.
>
> --
> Michal Hocko
> SUSE Labs