LinuxLists.cc - Re : .... get_page_from_freelist : MInority Suggestion to accept GFP

2009-07-07 00:02:43

Subject: Re : .... get_page_from_freelist : MInority Suggestion to accept GFP_NOFAIL accept during boot

Group,

Resending as plain text so that vger.kernel.org will accept this.

Mitchell Erblich
====================

On Jul 6, 2009, at 4:56 PM, Mitchell Erblich wrote:

> David,
>
> The web page http://lkml.indiana.edu/hypermail/linux/kernel/
>
> Looking at the thread of emails on June 24 at 11:07:23
> upcoming kerneloops.org item: get_page_from_freelist
>
> We have code from Arjan van de Ven
>
> it's this warning in mm/page_alloc.c:
>
> /*
>  * __GFP_NOFAIL is not to be used in new code.
>  *
>  * All __GFP_NOFAIL callers should be fixed so that they
>  * properly detect and handle allocation failures.
>  *
>  * We most definitely don't want callers attempting to
>  * allocate greater than single-page units with
>  * __GFP_NOFAIL.
>  */
> WARN_ON_ONCE(order > 0);
>
>
> Mitchell Erblich
> ======================
>
> On Jul 3, 2009, at 2:01 AM, David Rientjes wrote:
>
>> On Thu, 2 Jul 2009, Mitchell Erblich wrote:
>>
>>> Group,
>>>
>>>
>>> If I may suggest a minority opinion about the deprecation of the
>>> GFP_NOFAIL flag..
>>>
>>> I saw no discussion on the acceptance of using this flag during boot
>>> and shortly after boot.
>>>
>>> Many kernel structures require memory and thus should have that
>>> memory guaranteed before they continue.
>>>
>>> As Linux moves into embedded environments with smaller amounts of
>>> physical memory, the chance of early memory allocation failures
>>> becomes higher.
>>>
>>> For this reason alone, my minority opinion is to not deprecate the
>>> GFP_NOFAIL flag.
>>>
>>
>> I'm confused by your request because all allocations with orders under
>> PAGE_ALLOC_COSTLY_ORDER are inherently __GFP_NOFAIL and those that are not
>> can easily implement the same behavior in the caller:
>>
>>	struct page *page;
>>	do {
>>		page = alloc_pages(...);
>>	} while (!page);
>>
>> Hopefully something could be done to ensure the next call to alloc_pages()
>> would be more likely to succeed, but __GFP_NOFAIL doesn't provide that
>> anyway.
>

2009-07-07 03:38:03

by David Rientjes


Subject: Re: Re : .... get_page_from_freelist : MInority Suggestion to accept GFP_NOFAIL accept during boot

On Mon, 6 Jul 2009, Mitchell Erblich wrote:

> David,
>
> The web page http://lkml.indiana.edu/hypermail/linux/kernel/
>
> Looking at the thread of emails on June 24 at 11:07:23
> upcoming kerneloops.org item: get_page_from_freelist
>
> We have code from Arjan van de Ven
>
> it's this warning in mm/page_alloc.c:
>
> /*
>  * __GFP_NOFAIL is not to be used in new code.
>  *
>  * All __GFP_NOFAIL callers should be fixed so that they
>  * properly detect and handle allocation failures.
>  *
>  * We most definitely don't want callers attempting to
>  * allocate greater than single-page units with
>  * __GFP_NOFAIL.
>  */
> WARN_ON_ONCE(order > 0);
>

[ That's actually Andrew's code and comment, which has since been changed to

	WARN_ON_ONCE(order > 1);

by Linus. ]

Your suggestion to revert this "deprecation" doesn't make sense, though,
given the workarounds I mentioned earlier:

> On Jul 3, 2009, at 2:01 AM, David Rientjes wrote:
>
> > I'm confused by your request because all allocations with orders under
> > PAGE_ALLOC_COSTLY_ORDER are inherently __GFP_NOFAIL and those that are not
> > can easily implement the same behavior in the caller:
> >
> >	struct page *page;
> >	do {
> >		page = alloc_pages(...);
> >	} while (!page);
> >
> > Hopefully something could be done to ensure the next call to alloc_pages()
> > would be more likely to succeed, but __GFP_NOFAIL doesn't provide that
> > anyway.

That means any allocation of order less than or equal to
PAGE_ALLOC_COSTLY_ORDER (order-3 allocations) will already loop endlessly,
regardless of whether __GFP_NOFAIL is passed to the page allocator or not.
Secondly, you can use my code above to replicate the exact behavior of
__GFP_NOFAIL in the caller.

In other words, the page allocator doesn't need to implement any special
handling for __GFP_NOFAIL.
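
[ Editorial sketch: the caller-side retry loop quoted above can be tried out
in user space. This is only an illustration under stated assumptions, not
kernel code: try_alloc(), failures_left and alloc_nofail() are hypothetical
names, and malloc() stands in for alloc_pages(). ]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical: number of upcoming calls that will be made to fail. */
static int failures_left = 3;

/* Hypothetical stand-in for alloc_pages(): fails a few times, then succeeds. */
static void *try_alloc(size_t size)
{
    if (failures_left > 0) {
        failures_left--;
        return NULL;            /* simulate transient memory pressure */
    }
    return malloc(size);
}

/*
 * Caller-side equivalent of the do/while loop in the email: retry until
 * the allocation succeeds. *attempts reports how many calls were needed.
 */
static void *alloc_nofail(size_t size, unsigned int *attempts)
{
    void *p;

    assert(attempts != NULL);
    *attempts = 0;
    do {
        p = try_alloc(size);
        (*attempts)++;
    } while (!p);

    return p;
}
```

[ As in the email, nothing in this loop makes the next attempt more likely
to succeed; it only refuses to return NULL. ]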

2009-07-07 05:23:22

by Mitchell Erblich


Subject: Re: Re : .... get_page_from_freelist : MInority Suggestion to accept GFP_NOFAIL accept during boot

David,

After this email, if you want to keep
this thread going, then let's
take it private.

My argument was independent of memory / page order;
it was only that, IMO, the OS infrastructure during
bootup needs this flag.

My assumption, given the buddy lists, is that if an allocation
of a given order fails, then all greater supported orders will
also fail. That is, assuming GFP_NOFAIL and global memory:
yes, we need to sleep, but knowing that ATOMIC allocations
are also failing COULD be interesting.
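
[ Editorial sketch of the assumption above, using a toy model of buddy free
lists. It looks only at per-order free block counts and ignores watermarks,
reclaim and zones, so it is a simplification rather than the kernel's actual
logic. struct toy_zone, TOY_MAX_ORDER and can_allocate() are hypothetical
names. ]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ORDER 11    /* orders 0..10; an assumed limit for this model */

/* free_blocks[k] counts the free blocks of 2^k contiguous pages. */
struct toy_zone {
    unsigned int free_blocks[TOY_MAX_ORDER];
};

/*
 * An order-k request can be served by any free block of order >= k,
 * because a larger block can be split into buddies. So if order k
 * cannot be served, no larger order can be served either.
 */
static bool can_allocate(const struct toy_zone *z, unsigned int order)
{
    assert(z != NULL);
    for (unsigned int k = order; k < TOY_MAX_ORDER; k++) {
        if (z->free_blocks[k] > 0)
            return true;
    }
    return false;
}
```

[ In this model a failure at one order implies failure at every greater
order, which matches the assumption stated in the email. ]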

So maybe a log entry should be forced for ATOMIC and
NOFAIL allocations. Yes, limit the output with repeat counts.

Mitchell Erblich
===============
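
[ Editorial sketch of the "limit output with repeat counts" idea from the
message above, written as plain user-space C. It is one possible reading of
the suggestion, not an existing kernel interface: struct repeat_log and
log_with_repeats() are hypothetical names. ]

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Remembers the last message and how many repeats of it were suppressed. */
struct repeat_log {
    char last[128];
    unsigned int repeats;
};

/*
 * Print msg unless it matches the previous message; in that case only
 * count it. When a different message arrives, first report how often
 * the previous one was repeated. Returns 1 if msg was printed, else 0.
 */
static int log_with_repeats(struct repeat_log *rl, FILE *out, const char *msg)
{
    assert(rl != NULL && out != NULL && msg != NULL);

    if (rl->last[0] != '\0' &&
        strncmp(rl->last, msg, sizeof(rl->last) - 1) == 0) {
        rl->repeats++;
        return 0;
    }

    if (rl->repeats > 0)
        fprintf(out, "last message repeated %u times\n", rl->repeats);

    snprintf(rl->last, sizeof(rl->last), "%s", msg);
    rl->repeats = 0;
    fprintf(out, "%s\n", msg);
    return 1;
}
```

[ A caller would keep one struct repeat_log per call site, so a failing
allocation in a retry loop produces one line plus a repeat count instead of
a flood of identical entries. ]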
