Kairui Song <[email protected]> writes:
> From: Kairui Song <[email protected]>
>
> Currently, mem_cgroup_swapin_charge_folio is always called with
> mm argument as NULL, except in swapin_direct.
>
> swapin_direct is used when swapin should skip readahead and
> swapcache (SWP_SYNCHRONOUS_IO). Other caller paths of
> mem_cgroup_swapin_charge_folio are for swapin that should
> not skip readahead and cache.
>
> This could cause swapin charging to behave differently depending
> on swap device. This currently didn't happen because the only call
> path of swapin_direct is the direct anon page fault path, where mm
> equals to current->mm, but will no longer be true if swapin_direct
> is shared and have other callers (eg, swapoff).
>
> So make swapin_direct also passes NULL for mm, no feature change.
>
> Signed-off-by: Kairui Song <[email protected]>
> ---
> mm/swap_state.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index 6130de8d5226..d39c5369da21 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -881,7 +881,7 @@ struct folio *swapin_direct(swp_entry_t entry, gfp_t gfp_mask,
> folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
> vma, vmf->address, false);
> if (folio) {
> - if (mem_cgroup_swapin_charge_folio(folio, vma->vm_mm,
> + if (mem_cgroup_swapin_charge_folio(folio, NULL,
> GFP_KERNEL, entry)) {
> folio_put(folio);
> return NULL;
I think that why not provide "mm" when it's available? For
swapin_direct() called by do_swap_page(), mm can be provided. While,
for swapin_direct() called by shmem swapin, mm will be NULL. We can
even provide "mm" for __read_swap_cache_async() for VMA based swapin and
for the fault address for cluster swapin.
--
Best Regards,
Huang, Ying
Huang, Ying <[email protected]> 于2024年1月5日周五 15:16写道:
>
> Kairui Song <[email protected]> writes:
>
> > From: Kairui Song <[email protected]>
> >
> > Currently, mem_cgroup_swapin_charge_folio is always called with
> > mm argument as NULL, except in swapin_direct.
> >
> > swapin_direct is used when swapin should skip readahead and
> > swapcache (SWP_SYNCHRONOUS_IO). Other caller paths of
> > mem_cgroup_swapin_charge_folio are for swapin that should
> > not skip readahead and cache.
> >
> > This could cause swapin charging to behave differently depending
> > on swap device. This currently didn't happen because the only call
> > path of swapin_direct is the direct anon page fault path, where mm
> > equals to current->mm, but will no longer be true if swapin_direct
> > is shared and have other callers (eg, swapoff).
> >
> > So make swapin_direct also passes NULL for mm, no feature change.
> >
> > Signed-off-by: Kairui Song <[email protected]>
> > ---
> > mm/swap_state.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > index 6130de8d5226..d39c5369da21 100644
> > --- a/mm/swap_state.c
> > +++ b/mm/swap_state.c
> > @@ -881,7 +881,7 @@ struct folio *swapin_direct(swp_entry_t entry, gfp_t gfp_mask,
> > folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
> > vma, vmf->address, false);
> > if (folio) {
> > - if (mem_cgroup_swapin_charge_folio(folio, vma->vm_mm,
> > + if (mem_cgroup_swapin_charge_folio(folio, NULL,
> > GFP_KERNEL, entry)) {
> > folio_put(folio);
> > return NULL;
>
> I think that why not provide "mm" when it's available? For
> swapin_direct() called by do_swap_page(), mm can be provided. While,
> for swapin_direct() called by shmem swapin, mm will be NULL. We can
> even provide "mm" for __read_swap_cache_async() for VMA based swapin and
> for the fault address for cluster swapin.
Hi, Ying.
Thanks for the comment.
Without modifying too much code, providing mm here will change swapin
charge behaviour on swapoff, we discussed it previously:
https://lkml.org/lkml/2023/11/19/320
If we want to avoid the behavior change, we have to extend all direct
and indirect callers of mem_cgroup_swapin_charge_folio to accept a
standalone mm argument (including swapin_direct, swap_vma_readahead,
swap_cluster_readahead, __read_swap_cache_async, swapin_entry_mpol,
read_swap_cache_async, and some other path may need more audit), and
sort things our case by case. I'm not sure if that's a good idea...
Simply dropping it here seems the easiest way to avoid such change.
Kairui Song <[email protected]> writes:
> Huang, Ying <[email protected]> 于2024年1月5日周五 15:16写道:
>>
>> Kairui Song <[email protected]> writes:
>>
>> > From: Kairui Song <[email protected]>
>> >
>> > Currently, mem_cgroup_swapin_charge_folio is always called with
>> > mm argument as NULL, except in swapin_direct.
>> >
>> > swapin_direct is used when swapin should skip readahead and
>> > swapcache (SWP_SYNCHRONOUS_IO). Other caller paths of
>> > mem_cgroup_swapin_charge_folio are for swapin that should
>> > not skip readahead and cache.
>> >
>> > This could cause swapin charging to behave differently depending
>> > on swap device. This currently didn't happen because the only call
>> > path of swapin_direct is the direct anon page fault path, where mm
>> > equals to current->mm, but will no longer be true if swapin_direct
>> > is shared and have other callers (eg, swapoff).
>> >
>> > So make swapin_direct also passes NULL for mm, no feature change.
>> >
>> > Signed-off-by: Kairui Song <[email protected]>
>> > ---
>> > mm/swap_state.c | 2 +-
>> > 1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/mm/swap_state.c b/mm/swap_state.c
>> > index 6130de8d5226..d39c5369da21 100644
>> > --- a/mm/swap_state.c
>> > +++ b/mm/swap_state.c
>> > @@ -881,7 +881,7 @@ struct folio *swapin_direct(swp_entry_t entry, gfp_t gfp_mask,
>> > folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
>> > vma, vmf->address, false);
>> > if (folio) {
>> > - if (mem_cgroup_swapin_charge_folio(folio, vma->vm_mm,
>> > + if (mem_cgroup_swapin_charge_folio(folio, NULL,
>> > GFP_KERNEL, entry)) {
>> > folio_put(folio);
>> > return NULL;
>>
>> I think that why not provide "mm" when it's available? For
>> swapin_direct() called by do_swap_page(), mm can be provided. While,
>> for swapin_direct() called by shmem swapin, mm will be NULL. We can
>> even provide "mm" for __read_swap_cache_async() for VMA based swapin and
>> for the fault address for cluster swapin.
>
> Hi, Ying.
>
> Thanks for the comment.
>
> Without modifying too much code, providing mm here will change swapin
> charge behaviour on swapoff, we discussed it previously:
> https://lkml.org/lkml/2023/11/19/320
It's better to use "lore" for kernel email link, for example,
https://lore.kernel.org/lkml/[email protected]/
This is more readable than the above link.
> If we want to avoid the behavior change, we have to extend all direct
> and indirect callers of mem_cgroup_swapin_charge_folio to accept a
> standalone mm argument (including swapin_direct, swap_vma_readahead,
> swap_cluster_readahead, __read_swap_cache_async, swapin_entry_mpol,
> read_swap_cache_async, and some other path may need more audit), and
> sort things our case by case. I'm not sure if that's a good idea...
>
> Simply dropping it here seems the easiest way to avoid such change.
OK. Didn't realize that they are related directly. It appears that we
always pass NULL as mm to mem_cgroup_swapin_charge_folio(). If so,
please just remove the parameter.
--
Best Regards,
Huang, Ying
Huang, Ying <[email protected]> 于2024年1月8日周一 15:46写道:
>
> Kairui Song <[email protected]> writes:
>
> > Huang, Ying <[email protected]> 于2024年1月5日周五 15:16写道:
> >>
> >> Kairui Song <[email protected]> writes:
> >>
> >> > From: Kairui Song <[email protected]>
> >> >
> >> > Currently, mem_cgroup_swapin_charge_folio is always called with
> >> > mm argument as NULL, except in swapin_direct.
> >> >
> >> > swapin_direct is used when swapin should skip readahead and
> >> > swapcache (SWP_SYNCHRONOUS_IO). Other caller paths of
> >> > mem_cgroup_swapin_charge_folio are for swapin that should
> >> > not skip readahead and cache.
> >> >
> >> > This could cause swapin charging to behave differently depending
> >> > on swap device. This currently didn't happen because the only call
> >> > path of swapin_direct is the direct anon page fault path, where mm
> >> > equals to current->mm, but will no longer be true if swapin_direct
> >> > is shared and have other callers (eg, swapoff).
> >> >
> >> > So make swapin_direct also passes NULL for mm, no feature change.
> >> >
> >> > Signed-off-by: Kairui Song <[email protected]>
> >> > ---
> >> > mm/swap_state.c | 2 +-
> >> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >> >
> >> > diff --git a/mm/swap_state.c b/mm/swap_state.c
> >> > index 6130de8d5226..d39c5369da21 100644
> >> > --- a/mm/swap_state.c
> >> > +++ b/mm/swap_state.c
> >> > @@ -881,7 +881,7 @@ struct folio *swapin_direct(swp_entry_t entry, gfp_t gfp_mask,
> >> > folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0,
> >> > vma, vmf->address, false);
> >> > if (folio) {
> >> > - if (mem_cgroup_swapin_charge_folio(folio, vma->vm_mm,
> >> > + if (mem_cgroup_swapin_charge_folio(folio, NULL,
> >> > GFP_KERNEL, entry)) {
> >> > folio_put(folio);
> >> > return NULL;
> >>
> >> I think that why not provide "mm" when it's available? For
> >> swapin_direct() called by do_swap_page(), mm can be provided. While,
> >> for swapin_direct() called by shmem swapin, mm will be NULL. We can
> >> even provide "mm" for __read_swap_cache_async() for VMA based swapin and
> >> for the fault address for cluster swapin.
> >
> > Hi, Ying.
> >
> > Thanks for the comment.
> >
> > Without modifying too much code, providing mm here will change swapin
> > charge behaviour on swapoff, we discussed it previously:
> > https://lkml.org/lkml/2023/11/19/320
>
> It's better to use "lore" for kernel email link, for example,
>
> https://lore.kernel.org/lkml/[email protected]/
>
> This is more readable than the above link.
Hi Ying,
Thanks for your advice.
>
> > If we want to avoid the behavior change, we have to extend all direct
> > and indirect callers of mem_cgroup_swapin_charge_folio to accept a
> > standalone mm argument (including swapin_direct, swap_vma_readahead,
> > swap_cluster_readahead, __read_swap_cache_async, swapin_entry_mpol,
> > read_swap_cache_async, and some other path may need more audit), and
> > sort things our case by case. I'm not sure if that's a good idea...
> >
> > Simply dropping it here seems the easiest way to avoid such change.
>
> OK. Didn't realize that they are related directly. It appears that we
> always pass NULL as mm to mem_cgroup_swapin_charge_folio(). If so,
> please just remove the parameter.
Yes, that's also what I wanted.
>
> --
> Best Regards,
> Huang, Ying