2019-11-05 22:04:45

by Yang Shi

[permalink] [raw]
Subject: [PATCH] mm: shmem: use proper gfp flags for shmem_writepage()

The shmem_writepage() uses GFP_ATOMIC to allocate swap cache.
GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH |
__GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should
write out to swap only in response to memory pressure, so
__GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself
or in direct reclaim already.

In addition, XArray node allocations from PF_MEMALLOC contexts could
completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency
reserves from being allocated.

Here just copy the gfp flags used by add_to_swap().

Cc: Hugh Dickins <[email protected]>
Signed-off-by: Yang Shi <[email protected]>
---
mm/shmem.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 220be9f..9691dec 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
if (list_empty(&info->swaplist))
list_add(&info->swaplist, &shmem_swaplist);

- if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) {
+ if (add_to_swap_cache(page, swap,
+ __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) {
spin_lock_irq(&info->lock);
shmem_recalc_inode(inode);
info->swapped++;
--
1.8.3.1


2019-11-06 15:20:16

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH] mm: shmem: use proper gfp flags for shmem_writepage()

On Wed 06-11-19 06:02:31, Yang Shi wrote:
> The shmem_writepage() uses GFP_ATOMIC to allocate swap cache.
> GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH |
> __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should
> write out to swap only in response to memory pressure, so
> __GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself
> or in direct reclaim already.

What kind of problem are you trying to fix here?

> In addition, XArray node allocations from PF_MEMALLOC contexts could
> completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency
> reserves from being allocated.

I am not really familiar with XArray much, could you be more specific
please?

> Here just copy the gfp flags used by add_to_swap().
>
> Cc: Hugh Dickins <[email protected]>
> Signed-off-by: Yang Shi <[email protected]>
> ---
> mm/shmem.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 220be9f..9691dec 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
> if (list_empty(&info->swaplist))
> list_add(&info->swaplist, &shmem_swaplist);
>
> - if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) {
> + if (add_to_swap_cache(page, swap,
> + __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) {
> spin_lock_irq(&info->lock);
> shmem_recalc_inode(inode);
> info->swapped++;
> --
> 1.8.3.1

--
Michal Hocko
SUSE Labs

2019-11-06 18:24:56

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH] mm: shmem: use proper gfp flags for shmem_writepage()



On 11/6/19 7:18 AM, Michal Hocko wrote:
> On Wed 06-11-19 06:02:31, Yang Shi wrote:
>> The shmem_writepage() uses GFP_ATOMIC to allocate swap cache.
>> GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH |
>> __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should
>> write out to swap only in response to memory pressure, so
>> __GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself
>> or in direct reclaim already.
> What kind of problem are you trying to fix here?

I didn't run into any visible problem. I just happened to find this
inconsistency when I was looking into the other problem.

The add_to_swap() does:

int add_to_swap(struct page *page)
{
...
err = add_to_swap_cache(page, entry,
                        __GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN);
...
}

Actually, shmem_writepage() does almost the same thing and both of them
are called in reclaim context, so I didn't see why they should use
different gfp flag. And, GFP_ATOMIC is also different from the old
definition as I mentioned in the commit log.

>
>> In addition, XArray node allocations from PF_MEMALLOC contexts could
>> completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency
>> reserves from being allocated.
> I am not really familiar with XArray much, could you be more specific
> please?

It comes from the comments of add_to_swap(), says:

/*
         * XArray node allocations from PF_MEMALLOC contexts could
         * completely exhaust the page allocator. __GFP_NOMEMALLOC
         * stops emergency reserves from being allocated.

And, it looks the original comment came from pre-git time, TBH I'm not
quite sure about the specific problem which incurred this. I suspect it
may be because PF_MEMALLOC context allows ALLOC_NO_WATERMARK.

>
>> Here just copy the gfp flags used by add_to_swap().
>>
>> Cc: Hugh Dickins <[email protected]>
>> Signed-off-by: Yang Shi <[email protected]>
>> ---
>> mm/shmem.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index 220be9f..9691dec 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
>> if (list_empty(&info->swaplist))
>> list_add(&info->swaplist, &shmem_swaplist);
>>
>> - if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) {
>> + if (add_to_swap_cache(page, swap,
>> + __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) {
>> spin_lock_irq(&info->lock);
>> shmem_recalc_inode(inode);
>> info->swapped++;
>> --
>> 1.8.3.1

2019-11-06 19:01:17

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] mm: shmem: use proper gfp flags for shmem_writepage()

On Wed, 6 Nov 2019, Yang Shi wrote:
> On 11/6/19 7:18 AM, Michal Hocko wrote:
> > On Wed 06-11-19 06:02:31, Yang Shi wrote:
> > > The shmem_writepage() uses GFP_ATOMIC to allocate swap cache.
> > > GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH |
> > > __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should
> > > write out to swap only in response to memory pressure, so
> > > __GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself
> > > or in direct reclaim already.
> > What kind of problem are you trying to fix here?
>
> I didn't run into any visible problem. I just happened to find this
> inconsistency when I was looking into the other problem.

Yes, I don't think it fixes any actual problem: just a cleanup to
make the two calls look the same when they don't need to be different
(whereas the call from __read_swap_cache_async() rightly uses a
lower priority gfp).

If it does fix a problem, then you need to worry also about the
* TODO: this could cause a theoretical memory reclaim
* deadlock in the swap out path.
comment still against the call in add_to_swap(): but I think that
is equally theoretical, demanding no attention since 2.6.12.

>
> The add_to_swap() does:
>
> int add_to_swap(struct page *page)
> {
> ...
> err = add_to_swap_cache(page, entry,
>                         __GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN);
> ...
> }
>
> Actually, shmem_writepage() does almost the same thing and both of them are
> called in reclaim context, so I didn't see why they should use different gfp
> flag. And, GFP_ATOMIC is also different from the old definition as I
> mentioned in the commit log.
>
> >
> > > In addition, XArray node allocations from PF_MEMALLOC contexts could
> > > completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency
> > > reserves from being allocated.
> > I am not really familiar with XArray much, could you be more specific
> > please?
>
> It comes from the comments of add_to_swap(), says:
>
> /*
>          * XArray node allocations from PF_MEMALLOC contexts could
>          * completely exhaust the page allocator. __GFP_NOMEMALLOC
>          * stops emergency reserves from being allocated.
>
> And, it looks the original comment came from pre-git time, TBH I'm not quite
> sure about the specific problem which incurred this. I suspect it may be
> because PF_MEMALLOC context allows ALLOC_NO_WATERMARK.
>
> >
> > > Here just copy the gfp flags used by add_to_swap().
> > >
> > > Cc: Hugh Dickins <[email protected]>
> > > Signed-off-by: Yang Shi <[email protected]>
> > > ---
> > > mm/shmem.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/shmem.c b/mm/shmem.c
> > > index 220be9f..9691dec 100644
> > > --- a/mm/shmem.c
> > > +++ b/mm/shmem.c
> > > @@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page,
> > > struct writeback_control *wbc)
> > > if (list_empty(&info->swaplist))
> > > list_add(&info->swaplist, &shmem_swaplist);
> > > - if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) {
> > > + if (add_to_swap_cache(page, swap,
> > > + __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) {
> > > spin_lock_irq(&info->lock);
> > > shmem_recalc_inode(inode);
> > > info->swapped++;
> > > --
> > > 1.8.3.1

2019-11-07 20:20:33

by Yang Shi

[permalink] [raw]
Subject: Re: [PATCH] mm: shmem: use proper gfp flags for shmem_writepage()



On 11/6/19 10:59 AM, Hugh Dickins wrote:
> On Wed, 6 Nov 2019, Yang Shi wrote:
>> On 11/6/19 7:18 AM, Michal Hocko wrote:
>>> On Wed 06-11-19 06:02:31, Yang Shi wrote:
>>>> The shmem_writepage() uses GFP_ATOMIC to allocate swap cache.
>>>> GFP_ATOMIC used to mean __GFP_HIGH, but now it means __GFP_HIGH |
>>>> __GFP_ATOMIC | __GFP_KSWAPD_RECLAIM. However, shmem_writepage() should
>>>> write out to swap only in response to memory pressure, so
>>>> __GFP_KSWAPD_RECLAIM looks useless since the caller may be kswapd itself
>>>> or in direct reclaim already.
>>> What kind of problem are you trying to fix here?
>> I didn't run into any visible problem. I just happened to find this
>> inconsistency when I was looking into the other problem.
> Yes, I don't think it fixes any actual problem: just a cleanup to
> make the two calls look the same when they don't need to be different
> (whereas the call from __read_swap_cache_async() rightly uses a
> lower priority gfp).

I'm supposed it is because __read_swap_cache_async()is typically called
from page fault context which is less crucial than reclaim.

Shall I consider this as an ack but with commit log rephrased to reflect
the cleanup?

>
> If it does fix a problem, then you need to worry also about the
> * TODO: this could cause a theoretical memory reclaim
> * deadlock in the swap out path.
> comment still against the call in add_to_swap(): but I think that
> is equally theoretical, demanding no attention since 2.6.12.
>
>> The add_to_swap() does:
>>
>> int add_to_swap(struct page *page)
>> {
>> ...
>> err = add_to_swap_cache(page, entry,
>>                         __GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN);
>> ...
>> }
>>
>> Actually, shmem_writepage() does almost the same thing and both of them are
>> called in reclaim context, so I didn't see why they should use different gfp
>> flag. And, GFP_ATOMIC is also different from the old definition as I
>> mentioned in the commit log.
>>
>>>> In addition, XArray node allocations from PF_MEMALLOC contexts could
>>>> completely exhaust the page allocator, __GFP_NOMEMALLOC stops emergency
>>>> reserves from being allocated.
>>> I am not really familiar with XArray much, could you be more specific
>>> please?
>> It comes from the comments of add_to_swap(), says:
>>
>> /*
>>          * XArray node allocations from PF_MEMALLOC contexts could
>>          * completely exhaust the page allocator. __GFP_NOMEMALLOC
>>          * stops emergency reserves from being allocated.
>>
>> And, it looks the original comment came from pre-git time, TBH I'm not quite
>> sure about the specific problem which incurred this. I suspect it may be
>> because PF_MEMALLOC context allows ALLOC_NO_WATERMARK.
>>
>>>> Here just copy the gfp flags used by add_to_swap().
>>>>
>>>> Cc: Hugh Dickins <[email protected]>
>>>> Signed-off-by: Yang Shi <[email protected]>
>>>> ---
>>>> mm/shmem.c | 3 ++-
>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>>> index 220be9f..9691dec 100644
>>>> --- a/mm/shmem.c
>>>> +++ b/mm/shmem.c
>>>> @@ -1369,7 +1369,8 @@ static int shmem_writepage(struct page *page,
>>>> struct writeback_control *wbc)
>>>> if (list_empty(&info->swaplist))
>>>> list_add(&info->swaplist, &shmem_swaplist);
>>>> - if (add_to_swap_cache(page, swap, GFP_ATOMIC) == 0) {
>>>> + if (add_to_swap_cache(page, swap,
>>>> + __GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) {
>>>> spin_lock_irq(&info->lock);
>>>> shmem_recalc_inode(inode);
>>>> info->swapped++;
>>>> --
>>>> 1.8.3.1

2019-11-07 23:01:53

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH] mm: shmem: use proper gfp flags for shmem_writepage()

On Thu, 7 Nov 2019, Yang Shi wrote:
> On 11/6/19 10:59 AM, Hugh Dickins wrote:
> > Yes, I don't think it fixes any actual problem: just a cleanup to
> > make the two calls look the same when they don't need to be different
> > (whereas the call from __read_swap_cache_async() rightly uses a
> > lower priority gfp).
>
> I'm supposed it is because __read_swap_cache_async()is typically called from
> page fault context which is less crucial than reclaim.

Exactly.

>
> Shall I consider this as an ack but with commit log rephrased to reflect the
> cleanup?

Okay,
Acked-by: Hugh Dickins <[email protected]>