2021-10-25 19:03:30

by Michal Hocko

[permalink] [raw]
Subject: [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags.

From: Michal Hocko <[email protected]>

The core of the vmalloc allocator __vmalloc_area_node doesn't say
anything about gfp mask argument. Not all gfp flags are supported
though. Be more explicit about constrains.

Signed-off-by: Michal Hocko <[email protected]>
---
mm/vmalloc.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 602649919a9d..2199d821c981 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
* @caller: caller's return address
*
* Allocate enough pages to cover @size from the page level
- * allocator with @gfp_mask flags. Map them into contiguous
- * kernel virtual space, using a pagetable protection of @prot.
+ * allocator with @gfp_mask flags. Please note that the full set of gfp
+ * flags are not supported. GFP_KERNEL would be a preferred allocation mode
+ * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
+ * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
+ * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
+ * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
+ * __GFP_NOWARN can be used to suppress error messages about failures.
+ *
+ * Map them into contiguous kernel virtual space, using a pagetable
+ * protection of @prot.
*
* Return: the address of the area or %NULL on failure
*/
--
2.30.2


2021-10-26 02:53:53

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags.

On Tue, 26 Oct 2021, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> The core of the vmalloc allocator __vmalloc_area_node doesn't say
> anything about gfp mask argument. Not all gfp flags are supported
> though. Be more explicit about constrains.
>
> Signed-off-by: Michal Hocko <[email protected]>
> ---
> mm/vmalloc.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 602649919a9d..2199d821c981 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> * @caller: caller's return address
> *
> * Allocate enough pages to cover @size from the page level
> - * allocator with @gfp_mask flags. Map them into contiguous
> - * kernel virtual space, using a pagetable protection of @prot.
> + * allocator with @gfp_mask flags. Please note that the full set of gfp
> + * flags are not supported. GFP_KERNEL would be a preferred allocation mode
> + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not

In what sense is GFP_KERNEL "preferred"??
The choice of GFP_NOFS, when necessary, isn't based on preference but
on need.

I understand that you would prefer no one ever used GFP_NOFs ever - just
use the scope API. I even agree. But this is not the place to make
that case.

> + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
> + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka

I don't think "aka" is the right thing to use here. It is short for
"also known as" and there is nothing that is being known as something
else.
It would be appropriate to say (i.e. GFP_NOWAIT is not supported).
"i.e." is short for the Latin "id est" which means "that is" and
normally introduces an alternate description (whereas aka introduces an
alternate name).


> + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).

Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported.

> + * __GFP_NOWARN can be used to suppress error messages about failures.

Surely "NOWARN" suppresses warning messages, not error messages ....

Thanks,
NeilBrown


> + *
> + * Map them into contiguous kernel virtual space, using a pagetable
> + * protection of @prot.
> *
> * Return: the address of the area or %NULL on failure
> */
> --
> 2.30.2
>
>

2021-10-26 07:29:15

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags.

On Tue 26-10-21 10:26:06, Neil Brown wrote:
> On Tue, 26 Oct 2021, Michal Hocko wrote:
> > From: Michal Hocko <[email protected]>
> >
> > The core of the vmalloc allocator __vmalloc_area_node doesn't say
> > anything about gfp mask argument. Not all gfp flags are supported
> > though. Be more explicit about constrains.
> >
> > Signed-off-by: Michal Hocko <[email protected]>
> > ---
> > mm/vmalloc.c | 12 ++++++++++--
> > 1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 602649919a9d..2199d821c981 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > * @caller: caller's return address
> > *
> > * Allocate enough pages to cover @size from the page level
> > - * allocator with @gfp_mask flags. Map them into contiguous
> > - * kernel virtual space, using a pagetable protection of @prot.
> > + * allocator with @gfp_mask flags. Please note that the full set of gfp
> > + * flags are not supported. GFP_KERNEL would be a preferred allocation mode
> > + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
>
> In what sense is GFP_KERNEL "preferred"??
> The choice of GFP_NOFS, when necessary, isn't based on preference but
> on need.
>
> I understand that you would prefer no one ever used GFP_NOFs ever - just
> use the scope API. I even agree. But this is not the place to make
> that case.

Any suggestion for a better wording?

> > + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
> > + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
>
> I don't think "aka" is the right thing to use here. It is short for
> "also known as" and there is nothing that is being known as something
> else.
> It would be appropriate to say (i.e. GFP_NOWAIT is not supported).
> "i.e." is short for the Latin "id est" which means "that is" and
> normally introduces an alternate description (whereas aka introduces an
> alternate name).

OK

> > + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
>
> Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported.

Because they cannot be passed to the page table allocator. In both cases
the allocation would fail when system is short on memory. GFP_KERNEL
used for ptes implicitly doesn't behave that way.

>
> > + * __GFP_NOWARN can be used to suppress error messages about failures.
>
> Surely "NOWARN" suppresses warning messages, not error messages ....

I am not sure I follow. NOWARN means "do not warn" independently on the
log level chosen for the message. Is an allocation failure an error
message? Is the "vmalloc error: size %lu, failed to map pages" an error
message?

Anyway I will go with "__GFP_NOWARN can be used to suppress failure messages"

Is that better?
--
Michal Hocko
SUSE Labs

2021-10-26 14:13:57

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags.

On Tue, 26 Oct 2021, Michal Hocko wrote:
> On Tue 26-10-21 10:26:06, Neil Brown wrote:
> > On Tue, 26 Oct 2021, Michal Hocko wrote:
> > > From: Michal Hocko <[email protected]>
> > >
> > > The core of the vmalloc allocator __vmalloc_area_node doesn't say
> > > anything about gfp mask argument. Not all gfp flags are supported
> > > though. Be more explicit about constrains.
> > >
> > > Signed-off-by: Michal Hocko <[email protected]>
> > > ---
> > > mm/vmalloc.c | 12 ++++++++++--
> > > 1 file changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index 602649919a9d..2199d821c981 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > > * @caller: caller's return address
> > > *
> > > * Allocate enough pages to cover @size from the page level
> > > - * allocator with @gfp_mask flags. Map them into contiguous
> > > - * kernel virtual space, using a pagetable protection of @prot.
> > > + * allocator with @gfp_mask flags. Please note that the full set of gfp
> > > + * flags are not supported. GFP_KERNEL would be a preferred allocation mode
> > > + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
> >
> > In what sense is GFP_KERNEL "preferred"??
> > The choice of GFP_NOFS, when necessary, isn't based on preference but
> > on need.
> >
> > I understand that you would prefer no one ever used GFP_NOFs ever - just
> > use the scope API. I even agree. But this is not the place to make
> > that case.
>
> Any suggestion for a better wording?

"GFP_KERNEL, GFP_NOFS, and GFP_NOIO are all supported".

>
> > > + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
> > > + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
> >
> > I don't think "aka" is the right thing to use here. It is short for
> > "also known as" and there is nothing that is being known as something
> > else.
> > It would be appropriate to say (i.e. GFP_NOWAIT is not supported).
> > "i.e." is short for the Latin "id est" which means "that is" and
> > normally introduces an alternate description (whereas aka introduces an
> > alternate name).
>
> OK
>
> > > + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
> >
> > Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported.
>
> Because they cannot be passed to the page table allocator. In both cases
> the allocation would fail when system is short on memory. GFP_KERNEL
> used for ptes implicitly doesn't behave that way.

Could you please point me to the particular allocation which uses
GFP_KERNEL rather than the flags passed to __vmalloc_node()? I cannot
find it.

>
> >
> > > + * __GFP_NOWARN can be used to suppress error messages about failures.
> >
> > Surely "NOWARN" suppresses warning messages, not error messages ....
>
> I am not sure I follow. NOWARN means "do not warn" independently on the
> log level chosen for the message. Is an allocation failure an error
> message? Is the "vmalloc error: size %lu, failed to map pages" an error
> message?

If guess working with a C compiler has trained me to think that
"warnings" are different from "errors".

>
> Anyway I will go with "__GFP_NOWARN can be used to suppress failure messages"
>
> Is that better?

Yes, that's an excellent solution! Thanks.

NeilBrown


> --
> Michal Hocko
> SUSE Labs
>
>

2021-10-26 16:13:29

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH 3/4] mm/vmalloc: be more explicit about supported gfp flags.

On Tue 26-10-21 21:43:17, Neil Brown wrote:
> On Tue, 26 Oct 2021, Michal Hocko wrote:
> > On Tue 26-10-21 10:26:06, Neil Brown wrote:
> > > On Tue, 26 Oct 2021, Michal Hocko wrote:
> > > > From: Michal Hocko <[email protected]>
> > > >
> > > > The core of the vmalloc allocator __vmalloc_area_node doesn't say
> > > > anything about gfp mask argument. Not all gfp flags are supported
> > > > though. Be more explicit about constrains.
> > > >
> > > > Signed-off-by: Michal Hocko <[email protected]>
> > > > ---
> > > > mm/vmalloc.c | 12 ++++++++++--
> > > > 1 file changed, 10 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > > index 602649919a9d..2199d821c981 100644
> > > > --- a/mm/vmalloc.c
> > > > +++ b/mm/vmalloc.c
> > > > @@ -2980,8 +2980,16 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
> > > > * @caller: caller's return address
> > > > *
> > > > * Allocate enough pages to cover @size from the page level
> > > > - * allocator with @gfp_mask flags. Map them into contiguous
> > > > - * kernel virtual space, using a pagetable protection of @prot.
> > > > + * allocator with @gfp_mask flags. Please note that the full set of gfp
> > > > + * flags are not supported. GFP_KERNEL would be a preferred allocation mode
> > > > + * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
> > >
> > > In what sense is GFP_KERNEL "preferred"??
> > > The choice of GFP_NOFS, when necessary, isn't based on preference but
> > > on need.
> > >
> > > I understand that you would prefer no one ever used GFP_NOFs ever - just
> > > use the scope API. I even agree. But this is not the place to make
> > > that case.
> >
> > Any suggestion for a better wording?
>
> "GFP_KERNEL, GFP_NOFS, and GFP_NOIO are all supported".

OK. Check the incremental update at the end of the email

> > > > + * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
> > > > + * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
> > >
> > > I don't think "aka" is the right thing to use here. It is short for
> > > "also known as" and there is nothing that is being known as something
> > > else.
> > > It would be appropriate to say (i.e. GFP_NOWAIT is not supported).
> > > "i.e." is short for the Latin "id est" which means "that is" and
> > > normally introduces an alternate description (whereas aka introduces an
> > > alternate name).
> >
> > OK
> >
> > > > + * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
> > >
> > > Why do you think __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported.
> >
> > Because they cannot be passed to the page table allocator. In both cases
> > the allocation would fail when system is short on memory. GFP_KERNEL
> > used for ptes implicitly doesn't behave that way.
>
> Could you please point me to the particular allocation which uses
> GFP_KERNEL rather than the flags passed to __vmalloc_node()? I cannot
> find it.
>

It is dug
__vmalloc_area_node
vmap_pages_range
vmap_pages_range_noflush
vmap_range_noflush || vmap_small_pages_range_noflush
vmap_p4d_range
p4d_alloc_track
__p4d_alloc
p4d_alloc_one
get_zeroed_page(GFP_KERNEL_ACCOUNT)

the same applies for all other levels of page tables.

This is what I have currently
commit ae7fc6c2ef6949a76d697fc61bb350197dfca330
Author: Michal Hocko <[email protected]>
Date: Tue Oct 26 14:16:32 2021 +0200

fold me "mm/vmalloc: be more explicit about supported gfp flags."

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 2ddaa9410aee..82a07b04317e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2981,12 +2981,14 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
*
* Allocate enough pages to cover @size from the page level
* allocator with @gfp_mask flags. Please note that the full set of gfp
- * flags are not supported. GFP_KERNEL would be a preferred allocation mode
- * but GFP_NOFS and GFP_NOIO are supported as well. Zone modifiers are not
- * supported. From the reclaim modifiers__GFP_DIRECT_RECLAIM is required (aka
- * GFP_NOWAIT is not supported) and only __GFP_NOFAIL is supported (aka
- * __GFP_NORETRY and __GFP_RETRY_MAYFAIL are not supported).
- * __GFP_NOWARN can be used to suppress error messages about failures.
+ * flags are not supported. GFP_KERNEL, GFP_NOFS, and GFP_NOIO are all
+ * supported.
+ * Zone modifiers are not supported. From the reclaim modifiers
+ * __GFP_DIRECT_RECLAIM is required (aka GFP_NOWAIT is not supported)
+ * and only __GFP_NOFAIL is supported (i.e. __GFP_NORETRY and
+ * __GFP_RETRY_MAYFAIL are not supported).
+ *
+ * __GFP_NOWARN can be used to suppress failures messages.
*
* Map them into contiguous kernel virtual space, using a pagetable
* protection of @prot.
--
Michal Hocko
SUSE Labs